PREFACE TO THE FOURTH EDITION


This fourth edition contains several additions. The main ones concern
three closely related topics: Brownian motion, functional limit
distributions, and random walks. Besides the power and ingenuity of
their methods and the depth and beauty of their results, their importance
is fast growing in Analysis as well as in theoretical and applied
Probability.

These additions increased the book to an unwieldy size and it had to 
be split into two volumes. 

About half of the first volume is devoted to an elementary introduction,
then to mathematical foundations and basic probability concepts
and tools. The second half is devoted to a detailed study of Independence,
which played and continues to play a central role both by itself and
as a catalyst.

The main additions consist of a section on convergence of probabilities
on metric spaces and a chapter whose first section on domains of attraction
completes the study of the Central limit problem, while the second
one is devoted to random walks.

About a third of the second volume is devoted to conditioning and 
properties of sequences of various types of dependence. The other two 
thirds are devoted to random functions; the last Part on Elements of 
random analysis is more sophisticated. 

The main addition consists of a chapter on Brownian motion and limit
distributions.

It is strongly recommended that the reader begin with less involved
portions. In particular, the starred ones ought to be left out until they
are needed or unless the reader is especially interested in them.

I take this opportunity to thank Mrs. Rubalcava for her beautiful
typing of all the editions since the inception of the book. I also wish to
thank the editors of Springer-Verlag, New York, for their patience and
care.

M. L.

January, 1977 
Berkeley, California




PREFACE TO THE THIRD EDITION 


This book is intended as a text for graduate students and as a reference
for workers in Probability and Statistics. The prerequisite is honest
calculus. The material covered in Parts Two to Five inclusive requires
about three to four semesters of graduate study. The introductory part
may serve as a text for an undergraduate course in elementary
probability theory.

The Foundations are presented in:

the Introductory Part on the background of the concepts and problems,
treated without advanced mathematical tools;

Part One on the Notions of Measure Theory that every probabilist 
and statistician requires; 

Part Two on General Concepts and Tools of Probability Theory. 

Random sequences whose general properties are given in the Foundations
are studied in:

Part Three on Independence devoted essentially to sums of independent
random variables and their limit properties;

Part Four on Dependence devoted to the operation of conditioning 
and limit properties of sums of dependent random variables. The 
last section introduces random functions of second order. 

Random functions and processes are discussed in: 

Part Five on Elements of random analysis devoted to the basic concepts
of random analysis and to the martingale, decomposable,
and Markov types of random functions.

Since the primary purpose of the book is didactic, methods are 
emphasized and the book is subdivided into: 

unstarred portions, independent of the remainder; starred portions,
which are more involved or more abstract;

complements and details, including illustrations and applications of
the material in the text, which consist of propositions with frequent
hints; most of these propositions can be found in the articles and
books referred to in the Bibliography.

Also, for teaching and reference purposes, it has proved useful to name 
most of the results. 

Numerous historical remarks about results, methods, and the evolution
of various fields are an intrinsic part of the text. The purpose is
purely didactic: to attract attention to the basic contributions while
introducing the ideas explored. Books and memoirs of authors whose
contributions are referred to and discussed are cited in the Bibliography,
which parallels the text in that it is organized by parts and, within parts,
by chapters. Thus the interested student can pursue his study in the
original literature.

This work owes much to the reactions of the students on whom it has 
been tried year after year. However, the book is definitely more concise 
than the lectures, and the reader will have to be armed permanently 
with patience, pen, and calculus. Besides, in mathematics, as in any 
form of poetry, the reader has to be a poet in posse. 

This third edition differs from the second (1960) in a number of
places. Modifications vary all the way from a prefix (“sub”martingale
in lieu of “semi”-martingale) to an entire subsection (§36.2). To preserve
pagination, some additions to the text proper (especially 9, p. 656)
had to be put in the Complements and Details. It is hoped that moreover
most of the errors have been eliminated and that readers will be
kind enough to inform the author of those which remain.

I take this opportunity to thank those whose comments and criticisms
led to corrections and improvements: for the first edition, E. Barankin, S.
Bochner, E. Parzen, and H. Robbins; for the second edition, Y. S. Chow,
R. Cogburn, J. L. Doob, J. Feldman, B. Jamison, J. Karush, P. A. Meyer,
J. W. Pratt, B. A. Sevastianov, J. W. Woll; for the third edition, S.
Dharmadhikari, J. Fabius, D. Freedman, A. Maitra, U. V. Prokhorov.
My warm thanks go to Cogburn, whose constant help throughout the
preparation of the second edition has been invaluable. This edition has
been prepared with the partial support of the Office of Naval Research
and of the National Science Foundation.

M. L. 

April, 1962 
Berkeley, California



Part Four


DEPENDENCE 


For about two centuries probability theory has been concerned almost
exclusively with independence. Yet, very particular forms of dependence
appear already in the theory of games of chance. But a first
general type of dependence, chains, was introduced only at the beginning
of this century by Markov. Another type of dependence,
stationarity, appears in ergodic theory, and a related type, second
order stationarity, is then introduced in probability theory by
Khintchine (1932). Centering at conditional expectations by P. Lévy
(1935) gives rise to a new type of dependence: martingales.

At the very core of the study of dependence lies the concept of
conditioning (with respect to a function), put in an abstract and rigorous
form by Kolmogorov. In this part, the concept of conditioning is
introduced in a more general form (with respect to a $\sigma$-field) and, as
much as possible, the properties of various types of dependence are
related to more general results, with emphasis given to the methods.





CONDITIONING 


§27. CONCEPT OF CONDITIONING 


The concept of “conditioning” can be expressed in terms of sub
$\sigma$-fields of events. Conditional probabilities of events and conditional
expectations of r.v.'s “given a $\sigma$-field $\mathcal{B}$,” to be introduced and
investigated in this chapter, are $\mathcal{B}$-measurable functions defined up to an
equivalence. If $\mathcal{B}$ is determined by a countable partition of the sure
event, then these functions are elementary. In this “elementary case,”
a constructive approach with a definite intuitive appeal is possible and
there are no technical difficulties. In the general case, there is no
suitable and rigorous constructive approach, and a descriptive one, requiring
more powerful tools, especially the Radon-Nikodym theorem, has to be
used.

The R.-N. theorem was obtained in its abstract form in 1930 and the
concept of conditional probabilities and of conditional expectations of
integrable r.v.'s “given” a measurable function, finite or not, numerical
or not, was then put on a rigorous basis by Kolmogorov in 1933.

27.1. Elementary case. Investigation of the elementary case will give
us an insight into the ideas involved in the intuitive notion of
conditioning and will lead “naturally” to the notions and problems which
appear in the general case.

The notion of conditional probability of an event $A$ “given an event
$B$” corresponds to that of frequencies of $A$ in the repeated trials where
$B$ occurs; it is one of the oldest probability notions. For every event
$A$, the relation

    $$PB \cdot P_B A = PAB$$

defines the conditional probability (c.pr.) $P_B A$ of $A$ given $B$ as the ratio
$PAB/PB$, provided $B$ is a nonnull event; if $B$ is null, so is $AB$, and the
foregoing relation leaves $P_B A$ undetermined. In what follows, we
assume that, unless otherwise stated, $B$ is nonnull.

The function $P_B$ on the $\sigma$-field $\mathcal{A}$ of events, whose values are $P_B A$,
$A \in \mathcal{A}$, is called conditional pr. given $B$. The defining relation shows
at once that since $P$ on $\mathcal{A}$ is normed, nonnegative, and $\sigma$-additive, so
is $P_B$ on $\mathcal{A}$:

    $$P_B\Omega = 1, \quad P_B \geq 0, \quad P_B \sum A_j = \sum P_B A_j.$$

Thus, the conditioning expressed by “given $B$” means that the initial
pr. space $(\Omega, \mathcal{A}, P)$ is replaced by the pr. space $(\Omega, \mathcal{A}, P_B)$. The
expectation, if it exists, of a r.v. $X$ on this new pr. space is called conditional
expectation (c.exp.) given $B$ and is denoted by $E_B X$; in symbols

    $$E_B X = \int X \, dP_B.$$

Since $P_B = 0$ on $\{AB^c, A \in \mathcal{A}\}$, the right-hand side reduces to
$\int_B X \, dP_B$ and, since $P_B = \frac{1}{PB} P$ on $\{AB, A \in \mathcal{A}\}$, it becomes
$\frac{1}{PB} \int_B X \, dP$.


Therefore, the c.exp. of $X$ given $B$ can be defined directly by

    $$PB \cdot E_B X = \int_B X \, dP$$

and is determined if $B$ is a nonnull event. In particular,

    $$PB \cdot E_B I_A = PAB$$

so that the c.pr. $P_B A$ can be defined, thereafter, by

    $$P_B A = E_B I_A.$$

Thus, if $E_B$ is the c.exp. given $B$, with values $E_B X$ on the family $\mathcal{E}_B$ of
all r.v.'s $X$ whose integral on $B$ exists, the c.pr. $P_B$ becomes the
restriction of $E_B$ to the family $I_{\mathcal{A}}$ of indicators of events. Furthermore,
properties of $P_B$ become particular cases of the immediate properties
of $E_B$ below.

If $X \geq 0$ then $E_B X \geq 0$, and if $c$ is a constant then $E_B c = c$. If the
$X_j$ are nonnegative, or if the $X_j$ are integrable and their consecutive sums
are uniformly bounded by an integrable r.v., then $E_B \sum X_j = \sum E_B X_j$.
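
The defining relations of $P_B A$ and $E_B X$ can be tried out on a small example. The following is only a minimal sketch in Python, under assumptions that are not in the text: a hypothetical finite pr. space (one roll of a fair die) and illustrative names `prob`, `cond_prob`, `cond_exp`. Exact fractions are used so that the identity $P_B A = E_B I_A$ can be compared without rounding.

```python
from fractions import Fraction

# Hypothetical finite pr. space (an assumption for illustration): a fair die.
omega = [1, 2, 3, 4, 5, 6]
P = {w: Fraction(1, 6) for w in omega}


def prob(event):
    """PA for an event given as a set of sample points."""
    return sum(P[w] for w in event)


def cond_prob(A, B):
    """P_B A = PAB / PB; undetermined (here: an error) when B is null."""
    pb = prob(B)
    if pb == 0:
        raise ValueError("P_B A is undetermined for a null event B")
    return prob(A & B) / pb


def cond_exp(X, B):
    """E_B X = (1/PB) * (integral of X over B with respect to P)."""
    pb = prob(B)
    if pb == 0:
        raise ValueError("E_B X is undetermined for a null event B")
    return sum(X(w) * P[w] for w in B) / pb


B = {2, 4, 6}                                  # "the roll is even"
A = {4, 5, 6}                                  # "the roll is at least 4"
X = lambda w: w                                # face value of the die
indicator_A = lambda w: 1 if w in A else 0     # I_A

p_b_a = cond_prob(A, B)
e_b_x = cond_exp(X, B)
e_b_indicator = cond_exp(indicator_A, B)       # should coincide with p_b_a
```

In this sketch the c.pr. is recovered as the c.exp. of an indicator, which is the sense in which $P_B$ is the restriction of $E_B$ to indicators.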

C.exp.'s (hence c.pr.'s) acquire their full meaning when reinterpreted
as values of functions, as follows. The number $E_B X$ is no longer assigned
to $B$ but to every point of $B$, and similarly for $E_{B^c} X$, so that we have a
two-valued function on $\Omega$, with values $E_B X$ for $\omega \in B$ and $E_{B^c} X$ for
$\omega \in B^c$. More generally, let $\{B_j\}$ be a countable partition of $\Omega$ and
let $\mathcal{B}$ be the minimal $\sigma$-field over this partition. Let $\mathcal{E}$ be the family of
all r.v.'s $X$ whose expectation $EX$ exists, so that their indefinite
integrals, hence c.exp.'s given any nonnull event, exist. Consider the
elementary functions

    $$E^{\mathcal{B}} X = \sum (E_{B_j} X) I_{B_j}, \quad X \in \mathcal{E}.$$

If some $B_j$ are null, then the corresponding values $E_{B_j} X$ are
undetermined, so that $E^{\mathcal{B}} X$ is undetermined on the null event which is the sum
of null $B_j$. Such a possibility, together with the definition of $E_{B_j} X$,
leads to the following

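Constructive definition. The elementary function $E^{\mathcal{B}} X$ defined
up to an equivalence by

(1) $$E^{\mathcal{B}} X = \sum_j \left( \frac{1}{PB_j} \int_{B_j} X \, dP \right) I_{B_j}, \quad X \in \mathcal{E},$$

is the c.exp. of $X$ given $\mathcal{B}$.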

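Upon particularizing to indicators, the $\mathcal{B}$-measurable function $P^{\mathcal{B}} A$,
defined up to an equivalence by setting

    $$P^{\mathcal{B}} A = E^{\mathcal{B}} I_A, \quad A \in \mathcal{A},$$

will be the c.pr. of $A$ given $\mathcal{B}$; the contraction of $E^{\mathcal{B}}$ on $I_{\mathcal{A}}$, to be denoted
by $P^{\mathcal{B}}$, will be the c.pr. given $\mathcal{B}$, and its values are the $\mathcal{B}$-measurable
functions $P^{\mathcal{B}} A$, $A \in \mathcal{A}$, defined up to an equivalence.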

We say “given (the $\sigma$-field) $\mathcal{B}$” and not “given (the partition) $\{B_j\}$,”
because $E^{\mathcal{B}} X$ determines the c.exp. of $X$ given an arbitrary nonnull
event $B \in \mathcal{B}$. In fact, if $\sum'$ denotes the summation over some subclass
of $\{B_j\}$, then every event $B \in \mathcal{B}$ is of the form $B = \sum' B_j$, and we
have

    $$PB \cdot E_B X = \int_B X \, dP = \int_{\sum' B_j} X \, dP = \sum{}' \int_{B_j} X \, dP = \sum{}' PB_j \cdot E_{B_j} X.$$

This relation can also be written as follows: If $P_{\mathcal{B}}$ is the restriction of
$P$ to $\mathcal{B}$, defined by

    $$P_{\mathcal{B}} B = PB, \quad B \in \mathcal{B},$$

then the right-hand side becomes $\int_B (E^{\mathcal{B}} X) \, dP_{\mathcal{B}}$ while the left-hand side
is $\int_B X \, dP$. This leads to the following

Descriptive definition. The c.exp. $E^{\mathcal{B}} X$ of $X \in \mathcal{E}$ given $\mathcal{B}$ is any
$\mathcal{B}$-measurable function whose indefinite integral with respect to $P_{\mathcal{B}}$ is the
restriction to $\mathcal{B}$ of the indefinite integral of $X$ with respect to $P$. Since
the indefinite integral with respect to $P_{\mathcal{B}}$ of a $\mathcal{B}$-measurable function
determines this function up to an equivalence, this definition means precisely
that, for every $X \in \mathcal{E}$, $E^{\mathcal{B}} X$ is defined up to an equivalence by

    $$\int_B (E^{\mathcal{B}} X) \, dP_{\mathcal{B}} = \int_B X \, dP, \quad B \in \mathcal{B}.$$
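
The agreement between the constructive and the descriptive definitions can be checked on a small case. The sketch below, in Python, rests on assumptions made only for illustration: a hypothetical six-point pr. space with uniform $P$, a three-atom partition, the r.v. $X(\omega) = \omega^2$, and helper names (`cond_exp_given_partition`, `sigma_field`, `integral`) that are not from the text. It builds the elementary function $\sum (E_{B_j} X) I_{B_j}$ and compares its integral with that of $X$ over every $B \in \mathcal{B}$.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical finite pr. space and partition (assumptions for illustration).
omega = [1, 2, 3, 4, 5, 6]
P = {w: Fraction(1, 6) for w in omega}
partition = [{1, 2}, {3, 4, 5}, {6}]           # the B_j
X = {w: w * w for w in omega}                  # r.v. X(w) = w^2


def integral(f, event):
    """Integral of f over the event with respect to P."""
    return sum(f[w] * P[w] for w in event)


def cond_exp_given_partition(f, atoms):
    """Elementary function sum_j (E_{B_j} f) I_{B_j}.

    Null atoms are skipped: the function stays undetermined there.
    """
    out = {}
    for atom in atoms:
        p_atom = sum(P[w] for w in atom)
        if p_atom == 0:
            continue
        value = integral(f, atom) / p_atom
        for w in atom:
            out[w] = value
    return out


def sigma_field(atoms):
    """All unions of atoms: the minimal sigma-field over a finite partition."""
    events = []
    for r in range(len(atoms) + 1):
        for chosen in combinations(atoms, r):
            events.append(set().union(*chosen))
    return events


EX_B = cond_exp_given_partition(X, partition)

# Descriptive definition: equal integrals over every B in the sigma-field.
descriptive_holds = all(
    integral(EX_B, B) == integral(X, B) for B in sigma_field(partition)
)
```

With a finite partition the $\sigma$-field has only $2^3 = 8$ events here, so the defining relation can be tested exhaustively; in the general case no such enumeration is available, which is why the descriptive approach is needed.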


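To conclude the discussion of the elementary case, we revert to the
initial approach, first defining c.pr.'s and, then, defining c.exp.'s as
integrals. According to what precedes, we define $P^{\mathcal{B}}$ on $\mathcal{A}$ either by

    $$\int_B (P^{\mathcal{B}} A) \, dP_{\mathcal{B}} = PAB, \quad A \in \mathcal{A}, \ B \in \mathcal{B},$$

or, equivalently, by

    $$P^{\mathcal{B}} A = \sum (P_{B_j} A) I_{B_j}, \quad A \in \mathcal{A},$$

up to an equivalence.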

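Let $B_0$ be the null sum of all null $B_j$ and, for every $A$, select $P^{\mathcal{B}} A$
within its equivalence class by taking its values $P^{\mathcal{B}}_\omega A$ at $\omega \in B_j$ to be
$PAB_j/PB_j$ if $B_j$ is nonnull and $PA$ if $B_j$ is null ($\subset B_0$). Then, for every
$\omega \in \Omega$, the function $P^{\mathcal{B}}_\omega$ on $\mathcal{A}$, with values $P^{\mathcal{B}}_\omega A$, is a probability and
we can form integrals with respect to it. Let $X \in \mathcal{E}$ and set

    $$E^{\mathcal{B}}_\omega X = \int X \, dP^{\mathcal{B}}_\omega, \quad \omega \in \Omega.$$

Since, for every $\omega \in B_j$ not contained in the null event $B_0$, we have

    $$\int X \, dP^{\mathcal{B}}_\omega = \frac{1}{PB_j} \int_{B_j} X \, dP,$$

it follows that the function on $\Omega$ with values $E^{\mathcal{B}}_\omega X$ belongs to the
equivalence class of $E^{\mathcal{B}} X$. Thus, we can define $E^{\mathcal{B}} X$ to be $P_{\mathcal{B}}$-equivalent to
the integral $\int X \, dP^{\mathcal{B}}$ where $P^{\mathcal{B}}$, hence the integral, are functions of
$\omega \in \Omega$; in symbols

    $$E^{\mathcal{B}} X = \int X \, dP^{\mathcal{B}} \quad \text{a.s.}$$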




27.2. General case. The constructive approach fails at the very start
as soon as the “given” $\sigma$-fields are not generated by countable partitions.
However, the descriptive approach remains possible, thanks to the
Radon-Nikodym theorem.

Let $(\Omega, \mathcal{A}, P)$ denote, as usual, the pr. space. Let $\mathcal{B}$, with or without
affixes, denote a $\sigma$-field contained in $\mathcal{A}$, and let $P_{\mathcal{B}}$ denote the
restriction of $P$ to $\mathcal{B}$. Finally, let $\mathcal{E}$ be the family of all $\mathcal{A}$-measurable
functions whose integral (hence indefinite integral) exists.

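Definition. The c.exp. $E^{\mathcal{B}} X$ of $X \in \mathcal{E}$ given $\mathcal{B}$ is a $\mathcal{B}$-measurable
function, defined up to a $P_{\mathcal{B}}$-equivalence by

(1) $$\int_B (E^{\mathcal{B}} X) \, dP_{\mathcal{B}} = \int_B X \, dP, \quad B \in \mathcal{B}.$$

It follows at once that

1° $E(E^{\mathcal{B}} X) = EX$.

2° If $\mathcal{B} = \mathcal{A}$ or $X$ is $\mathcal{B}$-measurable, then $E^{\mathcal{B}} X = X$ a.s.

3° $E^{\mathcal{B}} X = E^{\mathcal{B}} X^+ - E^{\mathcal{B}} X^-$ a.s.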

The definition is justified: the indefinite integral $\varphi$ of $X$ being
$\sigma$-additive and $P$-continuous, its restriction $\varphi_{\mathcal{B}}$ to $\mathcal{B}$ is $\sigma$-additive and
$P_{\mathcal{B}}$-continuous, the extended Radon-Nikodym theorem applies, and the
$\mathcal{B}$-measurable function $E^{\mathcal{B}} X$ defined by (1) exists and is defined up to a
$P_{\mathcal{B}}$-equivalence.

If $\varphi_{\mathcal{B}}$ is $\sigma$-finite, then the Radon-Nikodym theorem applies, so that,
moreover, $E^{\mathcal{B}} X$ is finite except on an arbitrary null event belonging to
$\mathcal{B}$. If $X$ is a r.v., then $\varphi$ is $\sigma$-finite, but this does not imply that $\varphi_{\mathcal{B}}$ is
$\sigma$-finite: take $EX = \infty$ and $\mathcal{B} = \{\emptyset, \Omega\}$. However, such a possibility
is excluded in the case of integrable r.v.'s for, then, $\varphi$ and hence $\varphi_{\mathcal{B}}$ are
bounded.

We observe that as soon as it is understood that $E^{\mathcal{B}} X$ is, by definition,
a $\mathcal{B}$-measurable function, we can replace $P_{\mathcal{B}}$ by $P$, and properties of
$E^{\mathcal{B}} X$ valid, except on a $P_{\mathcal{B}}$-null event, may and will be said “a.s.”

The function $E^{\mathcal{B}}$ on $\mathcal{E}$ to the space of $\mathcal{B}$-measurable functions (more
precisely, on the space of equivalence classes of $\mathcal{A}$-measurable functions
possessing an integral to the space of equivalence classes of
$\mathcal{B}$-measurable functions) will be called c.exp. given $\mathcal{B}$. $E^{\mathcal{B}}$ can also be
considered as a function on $\Omega \times \mathcal{E}$ to $\bar{R} = [-\infty, +\infty]$ with values $E^{\mathcal{B}}_\omega X$
for $\omega \in \Omega$, $X \in \mathcal{E}$, the value for all $\omega$ belonging to an arbitrary $P_{\mathcal{B}}$-null
event being arbitrary.

The restriction of $E^{\mathcal{B}}$ to the family $I_{\mathcal{A}}$ of indicators of events is
called c.pr. given $\mathcal{B}$ and is denoted by $P^{\mathcal{B}}$; in other words, $P^{\mathcal{B}}$ is a
function on $\mathcal{A}$ whose values are $\mathcal{B}$-measurable functions $P^{\mathcal{B}} A$ defined up to
an equivalence by $P^{\mathcal{B}} A = E^{\mathcal{B}} I_A$ or, directly, by

    $$\int_B (P^{\mathcal{B}} A) \, dP_{\mathcal{B}} = PAB, \quad B \in \mathcal{B}.$$


Extension. It is “natural” to require of the definition of c.exp.'s
that $E^{\mathcal{A}} X = X$ a.s., whatever be the measurable function $X$. Yet, the
foregoing definition does not apply to those $X$ whose integrals do not
exist. However, it is possible to extend the definition so as to achieve
the foregoing requirement, as follows: Write $X = X^+ - X^-$ where,
as usual, $X^+$ and $X^-$ are the positive and negative parts of $X$,
respectively; $E^{\mathcal{B}} X^+$ and $E^{\mathcal{B}} X^-$ always exist but may be infinite. Define $E^{\mathcal{B}} X$
by $E^{\mathcal{B}} X = E^{\mathcal{B}} X^+ - E^{\mathcal{B}} X^-$, so that $E^{\mathcal{B}} X$ exists on the set on which the
difference is not of the form $+\infty - \infty$ up to a $P_{\mathcal{B}}$-null event. This
generalized c.exp. exists a.s. if the event $[X \geq 0]$ and hence $[X < 0]$
belong to $\mathcal{B}$ (and, in particular, if $\mathcal{B} = \mathcal{A}$), for then
$E^{\mathcal{B}} X^+ \cdot E^{\mathcal{B}} X^- = 0$ a.s. If $\mathcal{B} = \mathcal{A}$, then
$E^{\mathcal{A}} X^+ - E^{\mathcal{A}} X^- = X^+ - X^- = X$ a.s., whatever be the r.v. $X$.


27.3. Conditional expectation given a function. We connect now the
foregoing definition with the usual definition of c.exp., but we do not
assume, as usually done, that the c.exp.'s are restricted to those of
integrable r.v.'s.

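Let $Y$ be a function on $(\Omega, \mathcal{A}, P)$ to a measurable space $(\Omega', \mathcal{A}')$ and let
$\mathcal{B}_Y \subset \mathcal{A}$ and $\mathcal{B}'_Y \subset \mathcal{A}'$ be the $\sigma$-fields induced by $Y$ on $\Omega$ and $\Omega'$
respectively: $\mathcal{B}'_Y$ is the $\sigma$-field of all sets of $\Omega'$ whose inverse images under $Y$ are
events ($\in \mathcal{A}$) and $\mathcal{B}_Y$ is the $\sigma$-field of these events. Let $P_Y$ and $P'_Y$ be
the probabilities induced by $Y$ on $\mathcal{B}_Y$ and $\mathcal{B}'_Y$, respectively, defined by

    $$P_Y B = PB, \quad B \in \mathcal{B}_Y; \qquad P'_Y B' = PB, \quad B' \in \mathcal{B}'_Y, \ B = Y^{-1}(B').$$

(If $Y$ is measurable, then $\mathcal{B}'_Y = \mathcal{A}'$. If no $\mathcal{A}'$ is given, then we take
$\mathcal{A}' = S(\Omega')$.)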


If $\mathcal{B} = \mathcal{B}_Y$ in the definitions of the preceding subsection, then we
replace every $\mathcal{B}_Y$ by $Y$. Thus, we write $E^Y X$ instead of $E^{\mathcal{B}_Y} X$, and
call it c.exp. of $X$ given $Y$. The reason for this terminology is that, as
we shall show now, $E^Y X$ is a function of the function $Y$. We require
the following proposition.


a. For every numerical measurable function $g$ on $(\Omega', \mathcal{B}'_Y, P'_Y)$

    $$\int_{B'} g \, dP'_Y = \int_B g(Y) \, dP_Y, \quad B' \in \mathcal{B}'_Y, \ B = Y^{-1}(B'),$$

in the sense that, if one of these integrals exists, so does the other, and both
are equal.

$E^Y_\omega X$ for $\omega \in \Omega$, or it is considered as a function of $Y$ with values $E^Y_y X$
for $Y = y$, defined up to a $P_Y$- or $P'_Y$-equivalence, respectively.

Notation. The following symbols are and will be used according to
convenience:

    $E^Y X$ or $E(X \mid Y)$ or $E(Y; X)$,   $E^Y_y X$ or $E(X \mid y)$ or $E(y; X)$;
    $P^Y A$ or $P(A \mid Y)$ or $P(Y; A)$,   $P^Y_y A$ or $P(A \mid y)$ or $P(y; A)$;

and similarly with $y$ replaced by $\omega$ and/or $Y$ replaced by $\mathcal{B}$.


*27.4. Relative conditional expectations and sufficient $\sigma$-fields. The
Radon-Nikodym theorem applies to $\sigma$-finite and $\mu$-continuous signed
measures on $\mathcal{A}$ with $\sigma$-finite measures $\mu$ on $\mathcal{A}$. Therefore, the concept
of c.exp. continues to apply if, in what precedes, $P$ is replaced by any
such measure $\mu$. But, then, we have to specify that the c.exp.'s are
taken with respect to $\mu$: they are relative c.exp.'s. To simplify, we limit
ourselves to finite and $P$-continuous measures $\mu$. (Yet, we shall see in
the next volume that, led by physics, we may have to replace pr.'s by
$\sigma$-finite measures and, thus, use fully the foregoing conditioning.)

Given the pr. space $(\Omega, \mathcal{A}, P)$, the measures $\mu$ are indefinite integrals
of nonnegative r.v.'s $Z$ and we say that the relative c.exp.'s are taken
with respect to $Z$. In what follows, the r.v.'s $Z$, with or without affixes, are
nonnegative and integrable, and the $\mu$, with the same affixes if any, are their
indefinite integrals.

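If a r.v. $X$ possesses an integral with respect to $\mu$, then the c.exp. $E^{\mathcal{B}}_Z X$ of
$X$ given $\mathcal{B}$ with respect to $Z$ is a $\mathcal{B}$-measurable function, defined up to a
$\mu_{\mathcal{B}}$-equivalence by

    $$\int_B (E^{\mathcal{B}}_Z X) \, d\mu_{\mathcal{B}} = \int_B X \, d\mu, \quad B \in \mathcal{B}.$$

Since

    $$\mu A = \int_A Z \, dP, \quad A \in \mathcal{A},$$

it follows that

    $$\mu_{\mathcal{B}} B = \int_B Z \, dP = \int_B (E^{\mathcal{B}} Z) \, dP_{\mathcal{B}}, \quad B \in \mathcal{B},$$

and this definition is equivalent to

    $$\int_B (E^{\mathcal{B}} Z)(E^{\mathcal{B}}_Z X) \, dP_{\mathcal{B}} = \int_B ZX \, dP = \int_B (E^{\mathcal{B}} ZX) \, dP_{\mathcal{B}}, \quad B \in \mathcal{B},$$

which, in its turn, is equivalent to

(1) $$E^{\mathcal{B}} Z \cdot E^{\mathcal{B}}_Z X = E^{\mathcal{B}} ZX \quad \text{a.s.}$$

$E^{\mathcal{B}}_Z X$ is defined up to a $\mu_{\mathcal{B}}$-equivalence and $P_{\mathcal{B}} B = 0$ entails $\mu_{\mathcal{B}} B = 0$;
furthermore,

    $$\mu_{\mathcal{B}}[E^{\mathcal{B}} Z = 0] = \int_{[E^{\mathcal{B}} Z = 0]} (E^{\mathcal{B}} Z) \, dP_{\mathcal{B}} = 0.$$

Therefore, up to a $\mu_{\mathcal{B}}$-equivalence, $E^{\mathcal{B}}_Z X$ is given by

(1') $$E^{\mathcal{B}} ZX / E^{\mathcal{B}} Z,$$

so that

a. Relative c.exp.'s are reducible, up to an equivalence, to ratios of ordinary
c.exp.'s.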
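
Proposition a can be illustrated on a finite case. The following Python sketch is built on assumptions made only for the example: a hypothetical six-point pr. space with uniform $P$, a $\sigma$-field $\mathcal{B}$ generated by three atoms, a nonnegative $Z$ that vanishes on one atom, and the illustrative names `cond_exp` and `mu_integral`. It forms the ratio $E^{\mathcal{B}} ZX / E^{\mathcal{B}} Z$ and checks the defining relation of the relative c.exp. on the atoms, which suffices since every $B \in \mathcal{B}$ is a sum of atoms.

```python
from fractions import Fraction

# Hypothetical finite setting (assumptions for illustration only).
omega = [1, 2, 3, 4, 5, 6]
P = {w: Fraction(1, 6) for w in omega}
atoms = [{1, 2}, {3, 4, 5}, {6}]               # atoms generating B
Z = {1: 1, 2: 3, 3: 0, 4: 2, 5: 2, 6: 0}       # nonnegative; mu = indefinite integral of Z
X = {w: w for w in omega}


def cond_exp(f):
    """Ordinary c.exp. E^B f for the partition `atoms` (all atoms P-nonnull)."""
    out = {}
    for atom in atoms:
        p_atom = sum(P[w] for w in atom)
        value = sum(f[w] * P[w] for w in atom) / p_atom
        for w in atom:
            out[w] = value
    return out


E_Z = cond_exp(Z)
E_ZX = cond_exp({w: Z[w] * X[w] for w in omega})

# Ratio (1'); on the mu-null set [E^B Z = 0] the value is arbitrary (0 here).
relative = {
    w: (E_ZX[w] / E_Z[w] if E_Z[w] != 0 else Fraction(0)) for w in omega
}


def mu_integral(f, event):
    """Integral of f over the event with respect to mu, i.e. of f*Z w.r.t. P."""
    return sum(f[w] * Z[w] * P[w] for w in event)


# Defining relation of E_Z^B X, checked atom by atom.
checks = [mu_integral(relative, atom) == mu_integral(X, atom) for atom in atoms]
```

The atom $\{6\}$ is $P$-nonnull but $\mu$-null, so the choice made there does not affect the $\mu_{\mathcal{B}}$-equivalence class.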

It may happen that c.exp.'s given $\mathcal{B}$ relative to the r.v.'s of a family
$\{Z_t\}$ collapse together: there exists a r.v. $Z$ such that, for every $t$,

(2) $$E^{\mathcal{B}}_{Z_t} X = E^{\mathcal{B}}_Z X$$

in the sense that, whenever the left-hand side exists, so does the
right-hand side, and both are equal. But these sides are determined up to
$(\mu_t)_{\mathcal{B}}$- and $\mu_{\mathcal{B}}$-equivalences, respectively. Thus, equality might be
interpreted in the sense that the $\mu_{\mathcal{B}}$-equivalence class of $E^{\mathcal{B}}_Z X$ belongs
to every $(\mu_t)_{\mathcal{B}}$-equivalence class of $E^{\mathcal{B}}_{Z_t} X$'s. This is certainly true as
soon as the equality holds for an element of each class, provided every
$\mu_t$ is $\mu$-continuous. Then, moreover, whenever $E^{\mathcal{B}}_{Z_t} X$ exists so does
$E^{\mathcal{B}}_Z X$. Finally, we are led to the following definition.

Let $X$ be “admissible” for the family $\{Z_t\}$ if its integrals with respect
to every $\mu_t$ exist. A sub $\sigma$-field $\mathcal{B}$ of events is sufficient (with $Z$) for the
family $\{Z_t\}$ if there exists a $Z$ such that every $\mu_t$ is $\mu$-continuous and,
for every admissible $X$, (2) holds up to a $(\mu_t)_{\mathcal{B}}$-equivalence. This
concept of sufficient sub $\sigma$-fields is slightly more general than the usual
concept of “sufficient statistics” which plays a considerable role in
statistics. Clearly every $\sigma$-field $P$-equivalent to a sufficient $\mathcal{B}$ is
sufficient. Thus, in what follows, we assume that every sufficient $\sigma$-field
is defined up to a $P$-equivalence.

The basic result (originating with Neyman and put in its final form
by Halmos and Savage, in terms of sufficient statistics) is as follows:

A. Factorization theorem. The sub $\sigma$-field $\mathcal{B}$ of events is sufficient
for the family $\{Z_t\}$ if, and only if, there exists a $Z$ such that every $Z_t = g_t Z$
a.s. and every $g_t$ is $\mathcal{B}$-measurable; then every $g_t = E^{\mathcal{B}} Z_t / E^{\mathcal{B}} Z$ up to a
$\mu_{\mathcal{B}}$-equivalence.

We require two properties of c.exp.'s: 1° $E^{\mathcal{B}}$ and $\mathcal{B}$-measurable
functions commute (28.2, 3); 2° if, for $Y$ integrable or nonnegative, and
for every indicator $I_A$ of events, $E^{\mathcal{B}} Y I_A = E^{\mathcal{B}} Y' I_A$ a.s., then $Y = Y'$
a.s. since, for every $A \in \mathcal{A}$,

    $$\int_A Y \, dP = \int Y I_A \, dP = \int (E^{\mathcal{B}} Y I_A) \, dP_{\mathcal{B}} = \int (E^{\mathcal{B}} Y' I_A) \, dP_{\mathcal{B}} = \int Y' I_A \, dP = \int_A Y' \, dP.$$


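Proof. $Z_t = g_t Z$ a.s. entails $\mu$-continuity of $\mu_t$ and, by (1),

    $$E^{\mathcal{B}} Z_t \cdot E^{\mathcal{B}}_{Z_t} X = E^{\mathcal{B}} Z_t X = g_t E^{\mathcal{B}} ZX = g_t E^{\mathcal{B}} Z \cdot E^{\mathcal{B}}_Z X = E^{\mathcal{B}} Z_t \cdot E^{\mathcal{B}}_Z X \quad \text{a.s.}$$

The sets $[E^{\mathcal{B}} Z_t = 0]$ being $(\mu_t)_{\mathcal{B}}$-null, $E^{\mathcal{B}}_{Z_t} X = E^{\mathcal{B}}_Z X$ up to
$(\mu_t)_{\mathcal{B}}$-equivalence. The set $[E^{\mathcal{B}} Z = 0]$ being $\mu_{\mathcal{B}}$-null, it follows from

    $$E^{\mathcal{B}} Z_t = g_t E^{\mathcal{B}} Z \quad \text{a.s.}$$

that $g_t = E^{\mathcal{B}} Z_t / E^{\mathcal{B}} Z$ up to a $\mu_{\mathcal{B}}$-equivalence.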

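Conversely, if, for all indicators $X$, every $E^{\mathcal{B}}_{Z_t} X = E^{\mathcal{B}}_Z X$ up to a
$(\mu_t)_{\mathcal{B}}$-equivalence, then, by (1),

    $$E^{\mathcal{B}} Z \cdot E^{\mathcal{B}} Z_t X = E^{\mathcal{B}} Z_t \cdot E^{\mathcal{B}} Z \cdot E^{\mathcal{B}}_Z X = E^{\mathcal{B}} Z_t \cdot E^{\mathcal{B}} ZX \quad \text{a.s.}$$

so that

    $$E^{\mathcal{B}}(Z_t X E^{\mathcal{B}} Z) = E^{\mathcal{B}}(Z X E^{\mathcal{B}} Z_t) \quad \text{a.s.},$$
    $$Z_t E^{\mathcal{B}} Z = Z E^{\mathcal{B}} Z_t \quad \text{a.s.}$$

and, hence, on $B = [E^{\mathcal{B}} Z > 0]$,

(3) $$Z_t = \frac{E^{\mathcal{B}} Z_t}{E^{\mathcal{B}} Z} Z \quad \text{a.s.}$$

Since $\mu_t$ is $\mu$-continuous, from

    $$\mu B^c = \int_{B^c} Z \, dP = \int_{B^c} (E^{\mathcal{B}} Z) \, dP_{\mathcal{B}} = 0,$$

so that $Z = 0$ on $B^c$ except for $P$-null subsets, it follows that $\mu_t B^c = 0$;
hence $Z_t = 0$ on $B^c$ except for $P$-null subsets. Thus (3) is trivially true
on $B^c$. This completes the proof.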
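
The direct part of the factorization theorem can be seen at work on a classical case. The sketch below, in Python, is a hedged illustration under assumptions not in the text: $\Omega$ is the set of outcomes of two tosses, $P$ is the fair-coin pr., the hypothetical family $Z_t$ consists of the densities of two independent Bernoulli($t$) tosses with respect to $P$, and $\mathcal{B}$ is generated by the number of successes. Here $Z = 1$ and $g_t = Z_t$ is $\mathcal{B}$-measurable, so the relative c.exp.'s computed by formula (1') should not depend on $t$.

```python
from fractions import Fraction
from itertools import product

# Hypothetical setting (an assumption for illustration): two coin tosses.
omega = list(product([0, 1], repeat=2))
P = {w: Fraction(1, 4) for w in omega}         # reference pr.: fair coin
# B is generated by s = w1 + w2 (number of successes).
atoms = [[w for w in omega if sum(w) == s] for s in range(3)]


def Z_t(t):
    """Density, with respect to P, of two independent Bernoulli(t) tosses."""
    return {w: 4 * t ** sum(w) * (1 - t) ** (2 - sum(w)) for w in omega}


def cond_exp(f):
    """Ordinary c.exp. E^B f for the partition `atoms` (all atoms nonnull)."""
    out = {}
    for atom in atoms:
        p_atom = sum(P[w] for w in atom)
        value = sum(f[w] * P[w] for w in atom) / p_atom
        out.update({w: value for w in atom})
    return out


def relative_cond_exp(X, Z):
    """E_Z^B X = E^B(ZX) / E^B Z; valid here because E^B Z > 0 everywhere."""
    num = cond_exp({w: Z[w] * X[w] for w in omega})
    den = cond_exp(Z)
    return {w: num[w] / den[w] for w in omega}


X = {w: w[0] for w in omega}                   # indicator of "first toss is 1"
ordinary = cond_exp(X)                         # relative c.exp. with Z = 1
family = [Fraction(1, 4), Fraction(1, 2), Fraction(9, 10)]
relative = [relative_cond_exp(X, Z_t(t)) for t in family]
collapse = all(r == ordinary for r in relative)
```

Only three values of $t$ are tried, so this is a spot check of relation (2), not a proof; the common value on the atom $[s = 1]$ is $1/2$, whatever $t$.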

Underlying the concept of sufficient $\sigma$-fields with $Z$ is the fact that
every $\mu_t$ is supposed to be $\mu$-continuous. This alone implies that every
$Z_t = g_t Z$ a.s. where the $g_t$ are measurable. Thus, the whole $\sigma$-field $\mathcal{A}$
of events is trivially sufficient with any such $Z$ and, in particular, with
$Z = 1$. And every sub $\sigma$-field $\mathcal{B}$ of events such that all the $g_t$ are
$\mathcal{B}$-measurable is sufficient with such a $Z$; in particular, the sub $\sigma$-field $\mathcal{B}$ induced
by the family $\{g_t\}$ is the least fine sufficient with $Z$. The question
arises whether there exists some $Z$, say $Z_0$, such that the least fine
sufficient $\sigma$-field with $Z_0$ is the least fine of all possible sufficient $\sigma$-fields for
the family $\{Z_t\}$: the minimal sufficient $\sigma$-field for the family $\{Z_t\}$. The
answer is in the affirmative, as follows:

According to Chapter II: Complements and Details 23, there exists
a $Z_0$ such that

(i) $\mu_0 A = 0 \iff$ every $\mu_t A = 0$;

or, equivalently,

(i') up to $P$-null subsets, $Z_0 = 0$ on $A \iff$ every $Z_t = 0$ on $A$.

Since, on account of (i'), every $E^{\mathcal{B}}_Z X$ common to all the equivalence
classes $E^{\mathcal{B}}_{Z_t} X$ belongs also to the equivalence class of $E^{\mathcal{B}}_{Z_0} X$, it
follows that every sufficient $\sigma$-field $\mathcal{B}$ with $Z$ is also sufficient with $Z_0$.
Therefore, the least fine sufficient $\sigma$-field with $Z_0$ is the minimal one.
On account of (i'), the corresponding factorization (every $Z_t = g_t Z_0$
a.s.) is such that $Z_0 = 0$ a.s. $\Longrightarrow$ every $Z_t = 0$ a.s. Thus:

B. Minimality criterion. Write every $Z_t$ in the form $Z_t = g_t Z_0$ a.s.,
with $Z_0$ such that every $Z_t = 0$ a.s. $\Longrightarrow Z_0 = 0$ a.s.; this is always
possible. Then the minimal sufficient $\sigma$-field for the family $\{Z_t\}$ is the one
induced by the family $\{g_t\}$.

§28. PROPERTIES OF CONDITIONING 

To avoid constant repetitions, it will be assumed in this and the
following section that the integrals of all functions which figure under the
integration and c.exp. signs exist. We recall that an a.s. relation
between $\mathcal{B}$-measurable functions is a $P_{\mathcal{B}}$-equivalence.

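28.1. Expectation properties. Loosely speaking, c.exp.'s have a.s. all
properties of expectations.

Let $x_k$, $c$, and $c'$ be numbers.

1. If $X = c$ a.s. then $E^{\mathcal{B}} X = c$ a.s., and if $X \geq Y$ a.s. then
$E^{\mathcal{B}} X \geq E^{\mathcal{B}} Y$ a.s.

$E^{\mathcal{B}}$ is an a.s. linear operation: $E^{\mathcal{B}}(cX + c'X') = cE^{\mathcal{B}} X + c'E^{\mathcal{B}} X'$ a.s.
In particular,

    $$P^{\mathcal{B}} \Omega = 1 \ \text{a.s.}, \quad P^{\mathcal{B}} \emptyset = 0 \ \text{a.s.}, \quad P^{\mathcal{B}} A \geq 0 \ \text{a.s.}$$

and

    $$E^{\mathcal{B}} \left( \sum_{k=1}^{n} x_k I_{A_k} \right) = \sum_{k=1}^{n} x_k P^{\mathcal{B}} A_k \quad \text{a.s.}$$

These properties follow at once from the definition of c.exp.'s and
properties of integrals.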

Conditional inequalities. Upon replacing $E$ by $E^{\mathcal{B}}$, the $c_r$-,
Minkowski and Hölder inequalities, as well as their consequences and the
inequalities for convex functions, remain valid, almost surely.

For, on account of 1, their proofs remain valid up to a $P_{\mathcal{B}}$-equivalence
(for Hölder's inequality use also 28.2, 3).

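2. Convergence in the $r$th mean. If $X_n \xrightarrow{r} X$, then
$E^{\mathcal{B}} X_n \xrightarrow{r} E^{\mathcal{B}} X$ for $r \geq 1$.

Monotone convergence. If $0 \leq X_n \uparrow X$ a.s., then $0 \leq E^{\mathcal{B}} X_n \uparrow E^{\mathcal{B}} X$
a.s. In particular, $P^{\mathcal{B}} \sum_{k=1}^{\infty} A_k = \sum_{k=1}^{\infty} P^{\mathcal{B}} A_k$ a.s.

Fatou-Lebesgue convergence. Let $Y$ and $Z$ be integrable. If
$Y \leq X_n$ a.s. or $X_n \leq Z$ a.s., then $E^{\mathcal{B}} \liminf X_n \leq \liminf E^{\mathcal{B}} X_n$ a.s.,
resp., $\limsup E^{\mathcal{B}} X_n \leq E^{\mathcal{B}} \limsup X_n$ a.s.

In particular, if $Y \leq X_n \uparrow X$ a.s., or $Y \leq X_n \leq Z$ a.s. and
$X_n \xrightarrow{\text{a.s.}} X$, then $E^{\mathcal{B}} X_n \xrightarrow{\text{a.s.}} E^{\mathcal{B}} X$.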

The first assertion follows by

    $$E|E^{\mathcal{B}} X_n - E^{\mathcal{B}} X|^r = E|E^{\mathcal{B}}(X_n - X)|^r \leq E(E^{\mathcal{B}}|X_n - X|^r) = E|X_n - X|^r \to 0.$$

As for the monotone convergence assertion, since $X_{n+1} \geq X_n$ a.s.
implies $E^{\mathcal{B}} X_{n+1} \geq E^{\mathcal{B}} X_n$ a.s., it follows that $E^{\mathcal{B}} X_n \uparrow X'$ a.s. where $X'$ is
$\mathcal{B}$-measurable. Therefore, the monotone convergence criterion applies
to both sequences $X_n$ and $E^{\mathcal{B}} X_n$; for every $B \in \mathcal{B}$,

    $$\int_B X' \, dP_{\mathcal{B}} \leftarrow \int_B (E^{\mathcal{B}} X_n) \, dP_{\mathcal{B}} = \int_B X_n \, dP \to \int_B X \, dP = \int_B (E^{\mathcal{B}} X) \, dP_{\mathcal{B}}$$

and the assertion follows. Upon taking $X_n = \sum_{k=1}^{n} I_{A_k}$ so that
$E^{\mathcal{B}} X_n = \sum_{k=1}^{n} P^{\mathcal{B}} A_k$ a.s., the particular case is proved.

The Fatou-Lebesgue assertion follows from that of monotone
convergence as in the nonconditional case.

28.2. Smoothing properties. Loosely speaking, the operation $E^{\mathcal{B}}$ is a
$\mathcal{B}$-smoothing.

1. On every nonnull atom $B \in \mathcal{B}$, $E^{\mathcal{B}} X$ is constant and its value $E_B X$
is the average of values of $X$ on $B$ with respect to $P$.

By definition, $B$ is a nonnull atom of $\mathcal{B}$ if $PB > 0$ and $B$ contains no
other sets belonging to $\mathcal{B}$ than itself and the empty set.

Proof. The first assertion follows from the fact that $E^{\mathcal{B}} X$ is a
$\mathcal{B}$-measurable function defined up to a $P_{\mathcal{B}}$-equivalence and a $\mathcal{B}$-measurable
function is constant on atoms of $\mathcal{B}$. Therefore, on every atom $B$ of $\mathcal{B}$,

    $$E_B X \cdot PB = \int_B (E^{\mathcal{B}} X) \, dP_{\mathcal{B}} = \int_B X \, dP$$

and, for $PB > 0$,

    $$E_B X = \frac{1}{PB} \int_B X \, dP.$$

This proves the second assertion and completes the proof.

Thus, $E^{\mathcal{B}} X$ is a $\mathcal{B}$-smoothed $X$, in the sense that on atoms of $\mathcal{B}$ which
are not atoms of $\mathcal{A}$, $E^{\mathcal{B}} X$ is an “averaged $X$” and, on the whole, $E^{\mathcal{B}} X$ has
“fewer values” than $X$. In particular, if $\mathcal{B}$ is the minimal $\sigma$-field over a
countable partition $\{B_j\}$ of $\Omega$, so that the $B_j$ are atoms of $\mathcal{B}$, then, as
is to be expected,

    $$E^{\mathcal{B}} X = \sum (E_{B_j} X) I_{B_j} \quad \text{a.s.};$$

the right-hand side is a.s. defined since the $E_{B_j} X$ are determined except
for null $B_j$ whose countable sum is necessarily null. For the “least fine”
or “smallest” of all possible $\sigma$-fields $\mathcal{B} \subset \mathcal{A}$, that is, for $\mathcal{B}_0 = \{\emptyset, \Omega\}$,
we obtain $E^{\mathcal{B}_0} X = EX$ a.s. The same conclusion holds for every $\mathcal{B}$
independent of the $\sigma$-field $\mathcal{B}_X$ of events induced by $X$:

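2. If $\mathcal{B}$ and $\mathcal{B}_X$ are independent, then $E^{\mathcal{B}} X = EX$ a.s.

For, $X$ and $I_B$ being independent for every $B \in \mathcal{B}$,

    $$\int_B (E^{\mathcal{B}} X) \, dP_{\mathcal{B}} = \int_B X \, dP = E(X I_B) = EX \cdot PB = \int_B (EX) \, dP_{\mathcal{B}}.$$

In particular, since $E^Y X$ denotes $E^{\mathcal{B}_Y} X$ and independence of $X$ and $Y$
means independence of $\mathcal{B}_X$ and $\mathcal{B}_Y$, we have

If $X$ and $Y$ are independent, then $E^Y X = EX$ a.s., $E^X Y = EY$ a.s.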
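
Property 2 can be observed on a product space. The Python sketch below assumes, purely for illustration, a hypothetical pr. space $\Omega = \{0,1\} \times \{0,1,2\}$ carrying a product pr., with $X$ a function of the first coordinate and $Y$ the second coordinate; the names `p1`, `p2`, and `cond_exp` are not from the text. The c.exp. $E^Y X$ is computed on the atoms $[Y = y]$ of $\mathcal{B}_Y$ and compared with $EX$.

```python
from fractions import Fraction
from itertools import product

# Hypothetical marginals (assumptions for illustration); P is their product,
# so the two coordinates are independent.
p1 = {0: Fraction(1, 3), 1: Fraction(2, 3)}
p2 = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 4)}
omega = list(product(p1, p2))
P = {w: p1[w[0]] * p2[w[1]] for w in omega}

X = {w: 10 * w[0] + 1 for w in omega}          # depends on first coordinate only
# Atoms of B_Y, the sigma-field induced by Y(w) = w[1].
atoms_Y = [[w for w in omega if w[1] == y] for y in p2]


def cond_exp(f, atoms):
    """C.exp. of f given the sigma-field generated by nonnull atoms."""
    out = {}
    for atom in atoms:
        p_atom = sum(P[w] for w in atom)
        value = sum(f[w] * P[w] for w in atom) / p_atom
        for w in atom:
            out[w] = value
    return out


EX = sum(X[w] * P[w] for w in omega)           # plain expectation of X
E_Y_X = cond_exp(X, atoms_Y)                   # c.exp. of X given Y
constant = all(value == EX for value in E_Y_X.values())
```

Smoothing over a $\sigma$-field that carries no information about $X$ flattens $X$ to its expectation, as with the smallest $\sigma$-field $\mathcal{B}_0$.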
The operation $E^{\mathcal{B}}$ transforms $\mathcal{A}$-measurable functions (whose
integrals exist) into $\mathcal{B}$-measurable functions (whose integrals exist); in fact,
it transforms classes of $P$-equivalence into classes of $P_{\mathcal{B}}$-equivalence.
In particular, as is to be expected, the operation $E^{\mathcal{B}}$ does not modify
classes of $P_{\mathcal{B}}$-equivalence, in the sense that, if $X$ is $\mathcal{B}$-measurable, then
$E^{\mathcal{B}} X = X$ a.s.; since, then, for every $B \in \mathcal{B}$,

    $$\int_B (E^{\mathcal{B}} X) \, dP_{\mathcal{B}} = \int_B X \, dP = \int_B X \, dP_{\mathcal{B}}.$$

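More generally, $E^{\mathcal{B}}$ and $\mathcal{B}$-measurable factors commute, as follows:

3. If $X$ is $\mathcal{B}$-measurable, then $E^{\mathcal{B}} XY = X E^{\mathcal{B}} Y$ a.s.

The assertion holds for $X = I_{B'}$, where $B' \in \mathcal{B}$, since, for every
$B \in \mathcal{B}$,

    $$\int_B (E^{\mathcal{B}} I_{B'} Y) \, dP_{\mathcal{B}} = \int_B I_{B'} Y \, dP = \int_{BB'} (E^{\mathcal{B}} Y) \, dP_{\mathcal{B}} = \int_B (I_{B'} E^{\mathcal{B}} Y) \, dP_{\mathcal{B}}.$$

Therefore, it holds for simple functions $X_n$:

    $$E^{\mathcal{B}} X_n Y = X_n E^{\mathcal{B}} Y \quad \text{a.s.}$$

and, by the monotone convergence theorem for c.exp.'s, it holds for
nonnegative functions: take $0 \leq X_n \uparrow X$ and let $n \to \infty$ in the
foregoing relation. The assertion follows.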

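4. If $\mathcal{B} \subset \mathcal{B}'$, then

    $$E^{\mathcal{B}}(E^{\mathcal{B}'} X) = E^{\mathcal{B}} X = E^{\mathcal{B}'}(E^{\mathcal{B}} X) \quad \text{a.s.}$$

Since $\mathcal{B} \subset \mathcal{B}'$ implies that $P_{\mathcal{B}}$ is restriction of $P_{\mathcal{B}'}$ to $\mathcal{B}$, we have,
for every $B \in \mathcal{B}$,

    $$\int_B E^{\mathcal{B}}(E^{\mathcal{B}'} X) \, dP_{\mathcal{B}} = \int_B (E^{\mathcal{B}'} X) \, dP_{\mathcal{B}'} = \int_B X \, dP = \int_B (E^{\mathcal{B}} X) \, dP_{\mathcal{B}}$$

and the left-hand side equality is proved.

Since $\mathcal{B} \subset \mathcal{B}'$ implies that a $\mathcal{B}$-measurable function is $\mathcal{B}'$-measurable,
the right-hand equality follows either from 3 or directly from

    $$\int_{B'} (E^{\mathcal{B}'}(E^{\mathcal{B}} X)) \, dP_{\mathcal{B}'} = \int_{B'} (E^{\mathcal{B}} X) \, dP = \int_{B'} (E^{\mathcal{B}} X) \, dP_{\mathcal{B}'}, \quad B' \in \mathcal{B}'.$$

Thus, the smoothing can be performed in steps and remains a.s.
invariant under “finer” smoothings.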
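
Smoothing in steps can be watched on nested partitions. The following Python sketch is an illustration under assumptions of its own: a hypothetical six-point pr. space with nonuniform $P$, a coarse partition generating $\mathcal{B}$, a finer one generating $\mathcal{B}'$, and the r.v. $X(\omega) = \omega^2$; the helper `cond_exp` is an illustrative name. It computes the three members of property 4 and compares them exactly.

```python
from fractions import Fraction

# Hypothetical finite pr. space with nonuniform P (assumption for illustration).
omega = [1, 2, 3, 4, 5, 6]
weights = [1, 1, 2, 2, 3, 3]
P = {w: Fraction(k, 12) for w, k in zip(omega, weights)}

coarse = [{1, 2, 3}, {4, 5, 6}]                # atoms of B
fine = [{1}, {2, 3}, {4, 5}, {6}]              # atoms of B', finer than B
X = {w: w * w for w in omega}


def cond_exp(f, atoms):
    """C.exp. of f given the sigma-field generated by nonnull atoms."""
    out = {}
    for atom in atoms:
        p_atom = sum(P[w] for w in atom)
        value = sum(f[w] * P[w] for w in atom) / p_atom
        for w in atom:
            out[w] = value
    return out


left = cond_exp(cond_exp(X, fine), coarse)     # E^B (E^B' X)
middle = cond_exp(X, coarse)                   # E^B X
right = cond_exp(middle, fine)                 # E^B' (E^B X)
```

In this example the coarse smoothing absorbs the fine one in either order, which is the content of the two equalities of property 4.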
Together, 3 and 4 yield the

A. Basic smoothing property. If $\mathcal{B} \subset \mathcal{B}'$ and $X'$ is $\mathcal{B}'$-measurable,
then

    $$E^{\mathcal{B}} XX' = E^{\mathcal{B}}(X' E^{\mathcal{B}'} X) \quad \text{a.s.}$$

In particular, denoting by $E^{X',Y}$ the c.exp. given the $\sigma$-field $\mathcal{B}_{X',Y}$
of events induced by the couple $(X', Y)$, we have

    $$E^Y XX' = E^Y(X' E^{X',Y} X) \quad \text{a.s.}$$

*28.3. Concepts of conditional independence and of chains. Under
conditioning, the concept of independence extends as follows:

We say that $\mathcal{B}_1$ and $\mathcal{B}_2$ are conditionally independent (c.ind.) given
$\mathcal{B}$ if, for every $B_1 \in \mathcal{B}_1$ and $B_2 \in \mathcal{B}_2$,

    $$P^{\mathcal{B}} B_1 B_2 = P^{\mathcal{B}} B_1 \cdot P^{\mathcal{B}} B_2 \quad \text{a.s.}$$

If $\mathcal{B} = \mathcal{A}$, then this relation becomes $I_{B_1 B_2} = I_{B_1} I_{B_2}$ a.s., so that
two sub $\sigma$-fields of events are always c.ind. given the $\sigma$-field $\mathcal{A}$ of events,
and the concept of c.ind. given $\mathcal{A}$ is trivial.

If $\mathcal{B} = \mathcal{B}_0 = \{\emptyset, \Omega\}$, then this relation becomes $PB_1 B_2 = PB_1 \cdot PB_2$
a.s., so that independence is c.ind. given $\mathcal{B}_0$, the “smallest” of all sub
$\sigma$-fields of events.

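In what follows, we drop the parentheses and commas in writing
compound $\sigma$-fields.

A. $\mathcal{B}_1$ and $\mathcal{B}_2$ are c.ind. given $\mathcal{B}$ if, and only if, for every $B_2 \in \mathcal{B}_2$,

    $$P^{\mathcal{B}\mathcal{B}_1} B_2 = P^{\mathcal{B}} B_2 \quad \text{a.s.};$$

the subscripts 1 and 2 can be interchanged.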

Proof. Let B x C ®i and B 2 C ® 2 be arbitrary. We have to prove 
that 

(1) J^Wb 2 = a.s. 

is equivalent to 

(2) 严 1 / 办 = ^Ib 2 a.s. 

Since, on account of smoothing properties (25.2), 

£®/b/b 2 = a.s. 

a.s., 


and 



18 


CONDITIONING 


[Sec. 28] 


it suffices to prove that (2) is equivalent to 

⑶ ^{I Bl ^I Bz ) = E^{I Bl E^I Bl ) a.s. 

Upon multiplying both sides in (2) by I Bl and performing the operation 
(3) follows. 

Conversely, (3) implies that, for every 5 C ®, 

f (I Bl ^I B2 ) dP 

or, both c.exp/s being ©©immeasurable, 

f ( 护〜 dP^ x - f (£% 2 ) dP^ v 

JBBi JBBi 

Since bounded indefinite integrals coinciding on the class of all sets 
BB\ coincide on the <r-field (BCBi, it follows that the integrands 
and Ib 2 are r equivalent, and the proof is complete. 

Upon following literally the pattern used for the investigation of the concept of independence, the concept of c.ind. extends to arbitrary families of $\sigma$-fields and hence of r.v.'s, random vectors and random functions, and the investigations of the case of independence (Part III) can be transposed to the case of c.ind.

Furthermore, the concept of c.ind. leads to another generalization of that of independence, as follows: Let $\mathcal B_n$ be a sequence of sub $\sigma$-fields of events. The $\mathcal B_n$ are said to form a chain, or to be chained (or chain-dependent or Markov-dependent) if, for arbitrary integers $m$ and $n$, the $\sigma$-fields $\mathcal B_1, \ldots, \mathcal B_{n-1}$, and $\mathcal B_{n+1}, \ldots, \mathcal B_{n+m}$ are c.ind. given $\mathcal B_n$. In symbols, the $\mathcal B_n$ are chained, if, for every $m$, $n$, $B_k \in \mathcal B_k$,

(1) $P^{\mathcal B_n} B_1 \cdots B_{n-1} B_{n+1} \cdots B_{n+m} = P^{\mathcal B_n} B_1 \cdots B_{n-1} \cdot P^{\mathcal B_n} B_{n+1} \cdots B_{n+m}$ a.s.

or, equivalently, on account of A,

(2) $P^{\mathcal B_1 \cdots \mathcal B_n} B_{n+1} \cdots B_{n+m} = P^{\mathcal B_n} B_{n+1} \cdots B_{n+m}$ a.s.

or

(3) $P^{\mathcal B_n \mathcal B_{n+1} \cdots \mathcal B_{n+m}} B_1 \cdots B_{n-1} = P^{\mathcal B_n} B_1 \cdots B_{n-1}$ a.s.

If $n = 1, 2, \ldots$, is interpreted as the "time," we can say, loosely speaking, that the $\mathcal B_n$ form a chain if the "past" and the "future" are a.s. independent when the "present" is given, or, equivalently, the "future" ("past") is a.s. independent of the given "past" ("future"), when the "present" is given. We shall use mostly the defining property (2).

Let $k_1 < \cdots < k_{n-1} < k_n < k_{n+1} < \cdots < k_{n+m}$ be arbitrary integers and apply the operation $E^{\mathcal B_{k_1} \cdots \mathcal B_{k_n}}$ to both sides of (2) where $n$ and $n + m$ are replaced by $k_n$ and $k_{n+m}$ respectively. It follows, by 24.2, and upon replacing by $\Omega$ all events whose subscripts are different from $k_{n+1}, \ldots, k_{n+m}$, that we have the seemingly more general property

$P^{\mathcal B_{k_1} \cdots \mathcal B_{k_n}} B_{k_{n+1}} \cdots B_{k_{n+m}} = P^{\mathcal B_{k_n}} B_{k_{n+1}} \cdots B_{k_{n+m}}$ a.s.

Loosely speaking, whatever be the "future" it depends a.s. only upon the last given "past."

As usual, if $\mathcal B_n = \mathcal B_{X_n}$ are $\sigma$-fields of events induced by r.v.'s (random vectors, random functions) $X_n$, we replace above $\mathcal B_n$ by $X_n$ and speak about the chain of r.v.'s (random vectors, random functions) $X_n$.
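
The defining property (1) of a chain can be verified exactly for a small example. The sketch below assumes a hypothetical two-state chain $X_1, X_2, X_3$ (the initial law `p1`, the matrix `P` and all function names are illustrative, not taken from the text) and checks that, given the "present" $X_2$, the joint law of the "past" $X_1$ and the "future" $X_3$ factorizes.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical two-state chain: initial law p1 and one-step transition matrix P.
p1 = [F(1, 3), F(2, 3)]
P = [[F(1, 2), F(1, 2)],
     [F(1, 4), F(3, 4)]]


def joint(x1, x2, x3):
    """P[X1 = x1, X2 = x2, X3 = x3] for the chain started from p1."""
    return p1[x1] * P[x1][x2] * P[x2][x3]


def cond_given_present(x2):
    """Joint law of (X1, X3) and its two marginals, all conditional on X2 = x2."""
    pairs = list(product(range(2), repeat=2))
    norm = sum(joint(a, x2, c) for a, c in pairs)
    both = {(a, c): joint(a, x2, c) / norm for a, c in pairs}
    past = {a: sum(both[a, c] for c in range(2)) for a in range(2)}
    future = {c: sum(both[a, c] for a in range(2)) for c in range(2)}
    return both, past, future


def is_chained():
    """True if past and future are conditionally independent given each present state."""
    for x2 in range(2):
        both, past, future = cond_given_present(x2)
        if any(both[a, c] != past[a] * future[c] for a, c in both):
            return False
    return True
```

Calling `is_chained()` performs the comparison in exact rational arithmetic, so no tolerance is needed.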

§ 29. REGULAR PR. FUNCTIONS

29.1. Regularity and integration. Since c.exp.'s behave at first sight as integrals with respect to c.pr.'s, the question arises whether c.exp.'s can be so defined. More precisely, according to 25.1,

1° Properties of functions $P^{\mathcal B} A$ are almost surely those of pr. values:

$P^{\mathcal B} \Omega = 1$ a.s., $\quad P^{\mathcal B} A \ge 0$ a.s., $\quad P^{\mathcal B} \sum A_j = \sum P^{\mathcal B} A_j$ a.s.

2° Properties of functions $E^{\mathcal B} X$ are almost surely those defining integrals with respect to $P^{\mathcal B}$:

$E^{\mathcal B} \sum_{k=1}^{n} a_k I_{A_k} = \sum_{k=1}^{n} a_k P^{\mathcal B} A_k$ a.s.,

$0 \le X_n \uparrow X$ implies $E^{\mathcal B} X_n \uparrow E^{\mathcal B} X$ a.s.,

$E^{\mathcal B} X = E^{\mathcal B} X^+ - E^{\mathcal B} X^-$ a.s.

Yet, to speak about integrals with respect to the $P^{\mathcal B}_\omega$, we have to know that the $P^{\mathcal B}_\omega$ are pr.'s for every $\omega \in \Omega$ or, the c.exp.'s being defined up to an equivalence, that at the least the $P^{\mathcal B}_\omega$ are pr.'s except for $\omega$ belonging to some null event. Thus, we have to assume that $P^{\mathcal B}$ is "regular."

A c.pr. $P^{\mathcal B}$ is said to be regular if, for every $A \in \mathcal A$, it is possible to select $P^{\mathcal B} A$ within its class of equivalence in such a manner that the $P^{\mathcal B}_\omega$ are pr.'s on $\mathcal A$ except for points $\omega$ belonging to a $P_{\mathcal B}$-null event $N$. A regular pr.f. can be said to be defined up to an equivalence, in the sense that if all the functions $P^{\mathcal B} A$ are modified arbitrarily on an arbitrary but fixed $P_{\mathcal B}$-null event, the new c.pr. is still regular. In particular, a regular c.pr. $P^{\mathcal B}$ can be selected within its equivalence class so that $P^{\mathcal B}_\omega$ is a pr. on $\mathcal A$ for every $\omega \in \Omega$. For example, for every $\omega$ belonging to the exceptional $P_{\mathcal B}$-null event $N$ set $P^{\mathcal B}_\omega = P_N$ where $P_N$ is a pr. on $\mathcal A$. Unless otherwise stated, regular c.pr.'s will be so selected. In other words,

a regular c.pr. $P^{\mathcal B}$ with values $P^{\mathcal B}(\omega, A)$ will be a function on $\Omega \times \mathcal A$ with the following properties:

(i) $P^{\mathcal B}(\omega, A)$ is $\mathcal B$-measurable in $\omega\ (\in \Omega)$ for every fixed $A$ and is a pr. in $A\ (\in \mathcal A)$ for every fixed $\omega$,

(ii) For every $A \in \mathcal A$ and $B \in \mathcal B$,

$\int_B P^{\mathcal B}(\omega, A)\, dP_{\mathcal B} = PAB.$

In the case of regular c.pr.'s the answer to the question stated at the beginning of this section is, as might be expected, in the affirmative.

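A. Integration theorem. If $P^{\mathcal B}$ is a regular c.pr., then

$E^{\mathcal B} X = \int X\, dP^{\mathcal B}$ a.s.

Proof. Since all $P^{\mathcal B}_\omega$ are pr.'s on $\mathcal A$, we can write

$P^{\mathcal B}_\omega A = \int I_A\, dP^{\mathcal B}_\omega, \quad \omega \in \Omega,$

that is,

$E^{\mathcal B} I_A = P^{\mathcal B} A = \int I_A\, dP^{\mathcal B}.$

It follows, on account of relations 2°, that

$E^{\mathcal B} \sum_{k=1}^{n} a_k I_{A_k} = \sum_{k=1}^{n} a_k P^{\mathcal B} A_k = \int \Big( \sum_{k=1}^{n} a_k I_{A_k} \Big) dP^{\mathcal B}$ a.s.;

$0 \le X_n \uparrow X$, where the $X_n$ are simple functions, implies that $E^{\mathcal B} X = \lim E^{\mathcal B} X_n = \lim \int X_n\, dP^{\mathcal B} = \int X\, dP^{\mathcal B}$ a.s.;

$E^{\mathcal B} X = E^{\mathcal B} X^+ - E^{\mathcal B} X^- = \int X^+\, dP^{\mathcal B} - \int X^-\, dP^{\mathcal B} = \int X\, dP^{\mathcal B}$ a.s.;

and the assertion is proved.

B. Basic integration property. If $\mathcal B \subset \mathcal B' \subset \mathcal A$ and $P^{\mathcal B}$, $P^{\mathcal B'}$ are regular, then, for $\mathcal A$-measurable functions $X$ and $\mathcal B'$-measurable functions $X'$,

$\int X' X\, dP^{\mathcal B} = \int dP^{\mathcal B}\, X' \int X\, dP^{\mathcal B'}$ a.s.

The iterated integrations are to be read, as usual, from right to left. The foregoing relation can be written explicitly as follows: except for $\omega$ belonging to a $P_{\mathcal B}$-null event

$\int P^{\mathcal B}(\omega; d\omega')\, X'(\omega') X(\omega') = \int P^{\mathcal B}(\omega; d\omega')\, X'(\omega') \int P^{\mathcal B'}(\omega'; d\omega'')\, X(\omega'').$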


*29.2. Decomposition of regular c.pr.'s given separable $\sigma$-fields. The "elementary" case investigated in 24.1 corresponds to a given $\sigma$-field $\mathcal B$ generated by a countable disjoint class of events. It can then be assumed, without restricting the generality, that the class is a partition of $\Omega$ of the form $\sum B_j + B_0$, where every $PB_j > 0$ and $B_0$ is null but not necessarily empty. The corresponding "elementary" c.pr.'s can be written as

(1) $P^{\mathcal B} = \sum_j (P_{B_j}) I_{B_j} + (P_{B_0}) I_{B_0},$

where every $P_{B_j}$ is a pr. on $\mathcal A$ defined by

(2) $P_{B_j} A = \dfrac{PAB_j}{PB_j}, \quad A \in \mathcal A,$

so that $P_{B_j} B_j = 1$, and $P_{B_0}$ is an arbitrary pr. on $\mathcal A$ which disappears when $B_0$ is empty. Thus, an "elementary" c.pr. is regular and can be said to be "decomposed" into a countable set of pr.'s. We intend to show that regular c.pr.'s given separable $\sigma$-fields can be decomposed in an analogous manner.
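
The elementary decomposition (1)-(2) is easy to illustrate numerically. The sketch below assumes a hypothetical six-point space with equal weights and a three-atom partition (all names and values are illustrative); it evaluates $P_{B_j}A = PAB_j / PB_j$ atom by atom and checks the defining relation $\int_B P^{\mathcal B}(\omega, A)\, dP = PAB$ for one choice of $A$ and $B$.

```python
from fractions import Fraction as F

# Hypothetical finite pr. space: the six faces of a fair die.
prob = {w: F(1, 6) for w in range(1, 7)}
partition = [{1, 2}, {3, 4, 5}, {6}]   # the atoms B_j; here B_0 is empty


def pr(event):
    """P(event) for a set of sample points."""
    return sum(prob[w] for w in event)


def cond_pr(w, A):
    """Value at w of the elementary c.pr. of A: P(A B_j) / P(B_j) on the atom B_j containing w."""
    Bj = next(B for B in partition if w in B)
    return pr(A & Bj) / pr(Bj)


A = {2, 4, 6}
B = partition[0] | partition[2]                  # an event of the sigma-field generated by the atoms
lhs = sum(cond_pr(w, A) * prob[w] for w in B)    # integral over B of the c.pr. of A
rhs = pr(A & B)                                  # P(AB)
```

Each $P_{B_j}$ gives mass one to its own atom, which is the property $P_{B_j}B_j = 1$ used in the decomposition theorem.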

A $\sigma$-field $\mathcal B$ is separable if it is generated by (is minimal over) a countable class of sets.

a. If a $\sigma$-field $\mathcal B$ is separable, then every set $B \in \mathcal B$ is a sum of atoms $B_t$ of $\mathcal B$ such that $\sum_{t \in T} B_t = \Omega$ with $T \subset R$.

Proof. Let $B_j$ be the generators of $\mathcal B$ and let $B_t$ be the nonempty distinct sets of the form $\bigcap_j B_j'$ where $B_j' = B_j$ or $B_j^c$. Since the set of $j$'s is countable, the power of the set $T$ of $t$'s is at most that of the continuum, so that $T$ can be supposed to lie in $R$. Since the $B_t$ are disjoint and any $\omega \in \Omega$ belongs to one of them, $\sum_{t \in T} B_t = \Omega$. Since $\mathcal B$ is the $\sigma$-field generated by the $B_j$, every $B_t$ belongs to $\mathcal B$, and by construction contains no other sets belonging to $\mathcal B$ than itself and the empty set and is not empty; further, every $B \in \mathcal B$ is a sum of $B_t$. The assertion is proved.

The functions $P^{\mathcal B} A$, being $\mathcal B$-measurable, reduce to constants on atoms of $\mathcal B$. In fact, they reduce to constants on possibly larger events, namely, on atoms of the $\sigma$-field $\mathcal B_P \subset \mathcal B$ induced by these functions for $A$ varying over $\mathcal A$. The $\sigma$-field $\mathcal B_P$ is generated by events of the form $[P^{\mathcal B} A \in S]$ where $S$ are arbitrary Borel sets in $R$; and it suffices to take events of the form $[P^{\mathcal B} A < r]$ where the $r$ are positive rationals. The atoms of $\mathcal B_P$ will be called $P^{\mathcal B}$-atoms and every event contained in a $P^{\mathcal B}$-atom will be called $P^{\mathcal B}$-indecomposable; for example, atoms of $\mathcal B$ are $P^{\mathcal B}$-indecomposable.

A. Decomposition theorem. If $P^{\mathcal B}$ is a regular c.pr. and $\mathcal B$ contains a separable $\sigma$-field $\mathcal B'$ whose atoms are $P^{\mathcal B}$-indecomposable, then there exists a partition $\Omega = \sum_{t \in T} B_t + N$ with $T \subset R$ and $P_{\mathcal B} N = 0$ such that, except on $N$,

$P^{\mathcal B} = \sum_{t \in T} (P_{B_t}) I_{B_t},$

where the $P_{B_t}$ are pr.'s on $\mathcal A$ and $P_{B_t} B_t = 1$.

Proof. Let the countable class $\{B_j\}$ generate $\mathcal B' \subset \mathcal B$. The field generated by the $B_j$ is a countable class and, hence, it may be assumed that $\{B_j\}$ is a field.

Let $\sum B_t = \Omega$ be the partition into atoms $B_t \in \mathcal B'$ as constructed in a. Since, by assumption, these atoms are $P^{\mathcal B}$-indecomposable, the functions $P^{\mathcal B} A\ (A \in \mathcal A)$ reduce to constants $P_{B_t} A$ on $B_t$. Since $P^{\mathcal B}$ is regular, the $P_{B_t}$ are pr.'s on $\mathcal A$. It remains to show that, upon lumping together some atoms $B_t$ into a $P_{\mathcal B}$-null event, $P_{B_t} B_t = 1$ for the remaining ones.

For all $B \in \mathcal B$, the indicators $I_B$ being $\mathcal B$-measurable coincide with their c.exp. $P^{\mathcal B} B$ given $\mathcal B$ except on a $P_{\mathcal B}$-null event. Let $N_j$ be the $P_{\mathcal B}$-null event on which $P^{\mathcal B} B_j \ne I_{B_j}$. Since the functions $P^{\mathcal B} B_j$ do not vary on the atoms $B_t$, $N_j$ is the sum of some $B_t$ and the union $N = \bigcup N_j$ of all those exceptional atoms is $P_{\mathcal B}$-null. Fix $\omega$ belonging to a remaining atom $B_t$. Since $P^{\mathcal B}_\omega(B')$ and $I_{B'}(\omega)$ are values of pr.'s on $\mathcal B'$ and coincide on the generating field $\{B_j\}$, it follows that $P_{B_t} B' = I_{B'}(\omega)$ for every $B' \in \mathcal B'$; in particular, $P_{B_t} B_t = I_{B_t}(\omega) = 1$, and the proof is concluded.

Corollary 1. If $P^{\mathcal B}$ is a regular c.pr., and one of the $\sigma$-fields $\mathcal A$ or $\mathcal B$ or $\mathcal B_P$ is separable, then the foregoing decomposition holds.

Proof. Since atoms of $\mathcal B$ and of $\mathcal B_P\ (\subset \mathcal B)$ are $P^{\mathcal B}$-indecomposable, we have only to prove the assertion when $\mathcal A$ is separable. Thus, let $\{A_j\}$ be the countable class which generates $\mathcal A$; the class can be assumed to be a field. The countable class of events of the form $[P^{\mathcal B} A_j < r]$, $r$ rational, generates a separable $\sigma$-field $\mathcal B' \subset \mathcal B_P \subset \mathcal B$. It suffices to show that its atoms $B_t$ are $P^{\mathcal B}$-indecomposable.

The functions $P^{\mathcal B} A_j$ reduce to constants $P_{B_t} A_j$ on atoms $B_t$ of $\mathcal B'$ and, for $\omega \in B_t$, the pr.'s $P^{\mathcal B}_\omega$ and $P_{B_t}$ on $\mathcal A$ coincide on the field $\{A_j\}$. Therefore, they coincide on $\mathcal A$ and, hence, for every $A \in \mathcal A$, the functions $P^{\mathcal B} A$ reduce to constants $P_{B_t} A$ on atoms $B_t$. The proof is terminated.

Corollary 2. Under conditions of the decomposition theorem

$E^{\mathcal B} X = \sum_{t \in T} (E_{B_t} X) I_{B_t}$ a.s.

where

$E_{B_t} X = \int X\, dP_{B_t}, \quad t \in T.$

Apply the integration theorem 26.1A.

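In the elementary case, relation (2) can be written

$P_{B_j} A = \int_A p_{B_j}(\omega')\, dP(\omega')$

with

$p_{B_j}(\omega') = \dfrac{1}{PB_j} I_{B_j}(\omega'), \quad \omega' \in \Omega.$

Therefore the decomposition (1) becomes, for $\omega \notin B_0$,

$P^{\mathcal B}(\omega; A) = \int_A p^{\mathcal B}(\omega, \omega')\, dP(\omega')$

with

$p^{\mathcal B}(\omega, \omega') = \sum_j \dfrac{I_{B_j}(\omega) I_{B_j}(\omega')}{PB_j}, \quad \omega \notin B_0, \ \omega' \in \Omega,$

and, taking $P_{B_0} = P$, the integral relation holds for all $\omega \in \Omega$ provided we add $I_{B_0}(\omega)$ to $p^{\mathcal B}(\omega, \omega')$.

In general, let $\mu$ be a $\sigma$-finite measure on $\mathcal A$. We say that a regular c.pr. $P^{\mathcal B}$ is $\mu$-continuous if there exists an $\mathcal A$-measurable function $p^{\mathcal B}(\omega, \omega')$ in $\omega'$ such that, for every $\omega \in \Omega$ and $A \in \mathcal A$,

$P^{\mathcal B}(\omega, A) = \int_A p^{\mathcal B}(\omega, \omega')\, d\mu(\omega').$

The function $p^{\mathcal B}$ will be called the conditional pr. density given $\mathcal B$ with respect to $\mu$. It can and will be assumed to be nonnegative and finite. Furthermore, $\mu$ can and will be assumed to be a pr. on $\mathcal A$. If $\mu$ is finite, it suffices to set

$\mu' = \mu / \mu\Omega, \quad p'^{\mathcal B} = p^{\mathcal B} \cdot \mu\Omega.$

If $\mu$ is strictly $\sigma$-finite and $\sum A_n = \Omega$ is a partition such that every $\mu A_n < \infty$, it suffices to set

$\mu' A = \sum_n \dfrac{\mu A A_n}{2^n \mu A_n}, \quad A \in \mathcal A,$

and

$p'^{\mathcal B}(\omega, \omega') = p^{\mathcal B}(\omega, \omega') \cdot 2^n \mu A_n, \quad \omega \in \Omega, \ \omega' \in A_n.$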

Corollary 3. Under conditions of the decomposition theorem, if $P^{\mathcal B}$ is $\mu$-continuous, then the decomposition is countable; more precisely, the decomposition is

$\Omega = \sum B_j + N, \quad \mu B_j > 0, \quad PN = 0.$

Proof. Since $\mu$ is a pr. on $\mathcal A$ and hence on $\mathcal B$, there exists only a countable class $\{B_j\}$ of non $\mu$-null atoms $B_t$. On the other hand, if $B_t$ is one of the $\mu$-null atoms then, for any $\omega \in B_t$,

$1 = P_{B_t} B_t = \int_{B_t} p^{\mathcal B}(\omega, \omega')\, d\mu(\omega') = 0$

so that $B_t$ must be empty.

§ 30. CONDITIONAL DISTRIBUTIONS

30.1. Definitions and restricted integration. A regular c.pr. $P^{\mathcal B}$ restricted to a sub $\sigma$-field of events still has the regularity properties: it is $\mathcal B$-measurable and it is a pr. on the sub $\sigma$-field to which it is restricted. However, the converse is not necessarily true. Thus, in the search for regular c.pr.'s, it will be convenient to begin by investigating the weaker "restricted regularity." In fact, it will prove useful to extend this concept to functions of a point in a measurable space $(\Omega_1, \mathcal A_1)$ with points $\omega_1$ and measurable sets $A_1$ and of a measurable set in a measurable space $(\Omega_2, \mathcal A_2)$ with points $\omega_2$ and measurable sets $A_2$.

We shall not hesitate to proceed to the usual abuse of language, that is, according to convenience and the possible degree of confusion, we shall speak of "the function $h(\omega_1, A_2)$" instead of "the function $h$ on $\Omega_1 \times \mathcal A_2$." We say that the function $h(\omega_1, A_2)$ is an $\mathcal A_1$-measurable pr. if it is $\mathcal A_1$-measurable in $\omega_1$ for every fixed $A_2$ and is a pr. in $A_2$ for every fixed $\omega_1$. Observe that, whenever there exists a pr. $P_1$ on $\mathcal A_1$, then the function

$P_{12}(A_1 \times A_2) = \int_{A_1} P_1(d\omega_1)\, h(\omega_1, A_2)$

determines, by the extension theorem (for measures) a pr. on the product-measurable space $(\Omega_1 \times \Omega_2, \mathcal A_1 \times \mathcal A_2)$.

Let $X$ be a family of r.v.'s on the pr. space $(\Omega, \mathcal A, P)$. Let $\mathcal A_X$ be the $\sigma$-field of events induced by $X$, that is, the $\sigma$-field of the inverse images $[X \in S]$ of Borel sets $S$ in the range space $\mathfrak X$ of $X$. If a c.pr. $P^{\mathcal B}(\omega, A)$, where $A$ varies only over $\mathcal A_X$, is a pr. on $\mathcal A_X$ for every fixed $\omega \in \Omega$, we say that it is a conditional distribution (c.d.) of $X$ given $\mathcal B$. Clearly

A function $P^{\mathcal B}(\omega, A)$, where $A$ varies over $\mathcal A_X$, is a c.d. of $X$ given $\mathcal B$ if, and only if,

(CD$_1$) $P^{\mathcal B}(\omega, A)$ is a $\mathcal B$-measurable pr.

(CD$_2$) $\int_B P^{\mathcal B}(\omega, A)\, dP_{\mathcal B} = PAB$ for every $A \in \mathcal A_X$ and every $B \in \mathcal B$.

To c.pr.'s $P^{\mathcal B}(\omega, A)$ restricted to $\mathcal A_X$, we make correspond $\mathcal B$-measurable functions $Q^{\mathcal B}(\omega, S)$ such that, for every fixed Borel set $S$ in $\mathfrak X$,

(C) $Q^{\mathcal B}(\omega, S) = P^{\mathcal B}(\omega, [X \in S])$ up to a $P_{\mathcal B}$-equivalence.

If a function $Q^{\mathcal B}(\omega, S)$ in (C) is a pr. on the Borel sets $S$, we say that it is a mixed c.d. of $X$ given $\mathcal B$. Clearly, if there exists a c.d. of $X$ given $\mathcal B$, then there exists a mixed c.d. of $X$ given $\mathcal B$ but the converse is not necessarily true.

The importance of c.d.'s and mixed c.d.'s of $X$ is due to the fact that they still have the integration property of regular c.pr.'s, provided the integrand depends only upon $X$.

A. Restricted integration theorem. Let $g$ be a Borel function on the range space $\mathfrak X$ of a family $X$ of r.v.'s, such that $Eg(X)$ exists. If there exists a c.d. (mixed c.d.) of $X$ given $\mathcal B$, then, except for points $\omega$ in a $P_{\mathcal B}$-null set,

$E^{\mathcal B}(\omega, g(X)) = \int_\Omega P^{\mathcal B}(\omega, d\omega')\, g(X(\omega')) \ \Big( = \int_{\mathfrak X} Q^{\mathcal B}(\omega, dx)\, g(x) \Big).$

Proof. By definition of a c.d. (mixed c.d.), the asserted equality holds for indicators $g = I_S$. It follows, as usual, that it holds for simple, then for nonnegative, Borel functions, and the theorem follows.

If $Q^{\mathcal B}(\omega, S)$ is a mixed c.d. of a random vector $X = (X_1, \ldots, X_n)$, we set

$F^{\mathcal B}(\omega, x) = Q^{\mathcal B}(\omega, (-\infty, x)), \quad x = (x_1, \ldots, x_n),$

and call this function a conditional distribution function (c.d.f.) of $X$ given $\mathcal B$; it is $\mathcal B$-measurable in $\omega$ and a d.f. in $x$. Thus, we can form its Fourier-Stieltjes transform

$f^{\mathcal B}(\omega, u) = \int e^{iux}\, dF^{\mathcal B}(\omega, x), \quad u = (u_1, \ldots, u_n),$

where $ux = u_1 x_1 + \cdots + u_n x_n$, and shall call this function a conditional characteristic function (c.ch.f.) of $X$ given $\mathcal B$; it is $\mathcal B$-measurable in $\omega$ and a ch.f. in $u$.

Corollary. To a c.ch.f. $f^{\mathcal B}(\omega, u)$ of a random vector $X$ given $\mathcal B$, there correspond c.exp.'s $E^{\mathcal B} e^{iuX}$ such that, for every $\omega$ and every $u$,

$E^{\mathcal B}(\omega, e^{iuX}) = f^{\mathcal B}(\omega, u).$

For, we can select the c.exp.'s such that, according to the theorem, the equality holds for every rational point $u$, and then use the continuity property of ch.f.'s in passing to the limit along rational points.

30.2. Existence. The problem of existence of regular c.pr.'s has been investigated principally by Doob who begins by solving the problem of c.d.'s as follows.

a. Existence lemma. If there exists a c.d. of a family $X$ of r.v.'s given $\mathcal B$, then there exists a mixed c.d. of $X$ given $\mathcal B$. The converse is true when the range of $X$ is a Borel set.

We recall that the range of $X$ is the set of values $X(\omega)$ as $\omega$ varies over $\Omega$.

Proof. We use repeatedly the correspondence relation (C). The direct assertion follows at once by setting, for every $\omega \in \Omega$ and every Borel set $S$ in the range space of $X$,

$Q^{\mathcal B}(\omega, S) = P^{\mathcal B}(\omega, [X \in S]).$

In general, the converse is not true because a set $A \in \mathcal A_X$ may be inverse image of different Borel sets, say, $S$ and $S'$. However, when the range of $X$ is itself a Borel set $S_1$, then

$Q^{\mathcal B}(\omega, S_1^c) = P^{\mathcal B}(\omega, [X \in S_1^c]) = P^{\mathcal B}(\omega, \emptyset) = 0$

except for points $\omega$ of some $P_{\mathcal B}$-null set $N$. Therefore, when there exists a mixed c.d. $Q^{\mathcal B}(\omega, S)$, that is, a $\mathcal B$-measurable pr., then, $S S'^c$ and $S^c S'$ being in $S_1^c$, we have

$Q^{\mathcal B}(\omega, S) = Q^{\mathcal B}(\omega, S') = Q^{\mathcal B}(\omega, SS'), \quad \omega \notin N.$

It follows that, in (C), we can select a $\mathcal B$-measurable pr. by setting, for every $A \in \mathcal A_X$,

$P^{\mathcal B}(\omega, A) = Q^{\mathcal B}(\omega, S), \quad \omega \notin N,$

$P^{\mathcal B}(\omega, A) = Q^{\mathcal B}(\omega_0, S), \quad \omega \in N, \ \omega_0 \notin N,$

where $S$ is any image of $A$. This function is an asserted c.d., and the proof is complete.

A. C.d.'s existence theorem. There always exists a mixed c.d. of a countable family $X$ of r.v.'s given $\mathcal B$. If the range of $X$ is a Borel set, then there exists a c.d. of $X$ given $\mathcal B$.

Proof. On account of the existence lemma, it suffices to prove that there exists a mixed c.d. We show first that a c.d.f. exists; the proof is based upon the fact that the countable set of rational points $r = (r_1, \ldots, r_n)$ of an $n$-dimensional euclidean space is dense in it.

Let $x$, $x'$ and $r$, $r'$ denote points and rational points, respectively, of the range space of a random vector $X = (X_1, \ldots, X_n)$. Let $P^{\mathcal B}(\omega, A)$ be a c.pr. given $\mathcal B$, and, for every $r$, set

$F^{\mathcal B}(\omega, r) = P^{\mathcal B}(\omega, [X < r]), \quad \omega \in \Omega;$

the right-hand sides are selected arbitrarily within their $P_{\mathcal B}$-equivalence classes and kept fixed. Let $N$, with or without affixes, denote $P_{\mathcal B}$-null sets. On account of a.s. properties of c.pr.'s, we have

$F^{\mathcal B}(\omega, -\infty) = 0, \quad F^{\mathcal B}(\omega, +\infty) = 1, \quad \omega \notin N_0;$

$\Delta_{r'-r} F^{\mathcal B}(\omega, r) \ge 0, \quad r < r', \quad \omega \notin N_{rr'};$

$F^{\mathcal B}(\omega, r) \uparrow F^{\mathcal B}(\omega, r')$ as $r \uparrow r', \quad \omega \notin N_{r'}.$

The countable union

$N = N_0 \cup \bigcup_{r < r'} N_{rr'} \cup \bigcup_{r'} N_{r'}$

of $P_{\mathcal B}$-null sets is $P_{\mathcal B}$-null. For every $x$, set

$F^{\mathcal B}(\omega, x) = \lim_{r \uparrow x} F^{\mathcal B}(\omega, r), \quad \omega \notin N,$

$F^{\mathcal B}(\omega, x) = F^{\mathcal B}(\omega_0, x), \quad \omega \in N, \ \omega_0 \notin N.$

For every $\omega \in \Omega$, the function so defined is a d.f. and, by the correspondence theorem (for d.f.'s), the relation

$Q^{\mathcal B}(\omega, (-\infty, x)) = F^{\mathcal B}(\omega, x)$

determines a pr. $Q^{\mathcal B}(\omega, S)$ in Borel sets $S$.

This function is an asserted mixed c.d., provided we prove that this function is $\mathcal B$-measurable in $\omega$ and that, for every $S$,

$Q^{\mathcal B}(\omega, S) = P^{\mathcal B}(\omega, [X \in S])$

up to a $P_{\mathcal B}$-equivalence. By construction, the assertion is true for every $S = (-\infty, r)$. Hence, on account of the a.s. properties of c.pr.'s, it is true on the field of all finite sums of intervals $S$ and, by monotone passages to the limit, it is still true on the minimal monotone field over this field, that is, on the $\sigma$-field of all Borel sets $S$.

Now, let $X = (X_1, X_2, \ldots)$ be a countable family of r.v.'s. Once the $F^{\mathcal B}(\omega; r_1, \ldots, r_n)$ are selected, we can select the $F^{\mathcal B}(\omega; r_1, \ldots, r_n, r_{n+1})$ within the defining $P_{\mathcal B}$-equivalence classes so that, for every $\omega \in \Omega$,

$F^{\mathcal B}(\omega; r_1, \ldots, r_n, r_{n+1}) \to F^{\mathcal B}(\omega; r_1, \ldots, r_n)$ as $r_{n+1} \to \infty$.

Then, for every $\omega \in \Omega$, the foregoing construction yields consistent d.f.'s, hence consistent pr.'s, and, proceeding step by step with $n = 1, 2, \ldots$, we obtain a consistent family of pr.'s which, by the consistency theorem (for measures), determines a mixed c.d. $Q^{\mathcal B}(\omega, S)$ on the $\sigma$-field of all Borel sets $S$ in the range space of $X$. The theorem is proved.

Sample pr. spaces. As long as we are concerned only with a given family of r.v.'s, we can always take for pr. space, the sample pr. space of the family. To simplify the statements and the notations, we consider a countable family $X = (X_1, X_2, \ldots)$ of r.v.'s (or random vectors or random sequences). Set $R_{k_1 \cdots k_m} = \prod_{i=1}^{m} R_{k_i}$, denote by $S_{k_1 \cdots k_m}$ and $\mathcal S_{k_1 \cdots k_m}$ the Borel sets and their $\sigma$-field in this real space, and let $P_{k_1 \cdots k_m}$ be the distribution of $X_{k_1 \cdots k_m} = (X_{k_1}, \ldots, X_{k_m})$ defined by

$P_{k_1 \cdots k_m}(S_{k_1 \cdots k_m}) = P[X_{k_1 \cdots k_m} \in S_{k_1 \cdots k_m}].$

If the same affixes occur inside and outside a bracket, we shall omit either the inside or the outside ones, according to our convenience.

The sample pr. space of $X$ consists of the space $R_1 \times R_2 \times \cdots$, the $\sigma$-field of Borel sets in this space, and the distribution of $X$ on this $\sigma$-field. We take it for our pr. space $(\Omega, \mathcal A, P)$. Then

$X(x_1, x_2, \ldots) = (x_1, x_2, \ldots)$

and the range of any $X_n$ coincides with its range space $R_n$. Therefore, the existence theorem applies and, for every $\sigma$-field $\mathcal B \subset \mathcal A$, there exists a c.d. of any subfamily of $X$ given $\mathcal B$; in fact, there exists a c.d. of the countable family $X$ given $\mathcal B$, that is, a regular c.pr. $P^{\mathcal B}$. Thus

B. Regularity theorem. C.pr.'s in sample pr. spaces of countable families of r.v.'s can be regularized and c.d.'s of their subfamilies always exist.

In the remainder of this section we take for pr. space of $X = (X_1, X_2, \ldots)$ its sample pr. space and can and will assume that the c.pr.'s given a measurable function $Y$ on $\Omega$ are expressed as functions of $Y$ and are regularized. By applying repeatedly the restricted integration theorem, we obtain

b. If $g$ is a Borel function on $R_{1 \cdots n}$, such that $Eg(X_1, \ldots, X_n)$ exists, then

$Eg(X_1, \ldots, X_n) = \int g\, dP_{1 \cdots n} = \int P(dx_1) \int P(x_1; dx_2) \cdots \int P(x_1, \ldots, x_{n-1}; dx_n)\, g(x_1, \ldots, x_n)$

and, except for a $P_1$-null set of points $x_1$,

$E(x_1; g(X_1, \ldots, X_n)) = \int P(x_1; dx_2 \cdots dx_n)\, g(x_1, \ldots, x_n) = \int P(x_1; dx_2) \cdots \int P(x_1, \ldots, x_{n-1}; dx_n)\, g(x_1, \ldots, x_n).$

The c.d.'s defining properties separate as follows:

c. Property

(CD$_1$) $P(x_1; S_2)$ is an $\mathcal S_1$-measurable pr. on $\mathcal S_2$ (the $\sigma$-fields of Borel sets $S_1$ and $S_2$)

characterizes c.d.'s of $X_2$ given $X_1$, and property

(CD$_2$) $P(S_1 \times S_2) = \int_{S_1} P(dx_1)\, P(x_1; S_2)$

relates the distributions of $X_1$ and $(X_1, X_2)$.

Applications. 1° The law of the countable family $(X_1, X_2, \ldots)$ is defined by the distribution of this family which determines and is determined by a consistent family of distributions $P(S_1), P(S_{12}), \ldots$. Because of the consistency requirement, this family of distributions is superabundant. Conditioning permits us to determine the law by means of a nonsuperabundant family of measurable pr.'s (that is, with no required relations among members). For, by applying repeatedly the above propositions, we find that

The law of the countable family $X_1, X_2, \ldots$ determines and is determined by a family $P(S_1), P(x_1; S_2), P(x_1, x_2; S_3), \ldots$ of c.d.'s.

Clearly, we can replace c.d.'s by c.d.f.'s or by c.ch.f.'s.

2° Let $X_1, X_2, \ldots$ be r.v.'s on their sample space $(\Omega, \mathcal A, P)$ with joint d.f.'s $F_{k_1 \cdots k_m}$ and c.d.f.'s $F^{\mathcal B}_{k_1 \cdots k_m}$ of $(X_{k_1}, \ldots, X_{k_m})$. We can define conditional independence of the $X_n$ given $\mathcal B$ by the property

$F^{\mathcal B}_{k_1 \cdots k_m} = F^{\mathcal B}_{k_1} \cdots F^{\mathcal B}_{k_m}$

for arbitrary finite subsets $k_1, \ldots, k_m$ of subscripts. Then

$F_{k_1 \cdots k_m} = E(F^{\mathcal B}_{k_1} \cdots F^{\mathcal B}_{k_m})$

where the expectation is obtained by integrating with respect to $P_{\mathcal B}$.

Conversely, any family $X_1, X_2, \ldots$ is trivially conditionally independent given $\mathcal A$; we exclude this trivial case.

If the r.v.'s are conditionally independent with common c.d.f. $F^{\mathcal B}$, then

$F_{k_1 \cdots k_m}(x_1, \ldots, x_m) = E(F^{\mathcal B}(x_1) \cdots F^{\mathcal B}(x_m)).$

Thus, the joint d.f.'s of any $m$ of the r.v.'s do not depend upon their subscripts but only upon their number $m$. If the joint d.f.'s have this property, that is, for every finite subset $k_1, \ldots, k_m$,

$F_{k_1 \cdots k_m} = F_{1 \cdots m},$

we say that the r.v.'s are exchangeable. This concept was introduced by de Finetti and his basic result, in terms of conditional independence and in a somewhat more precise form, is as follows.

The concept of exchangeability is equivalent to that of conditional independence with common c.d.f.

The second concept implying the first one, it suffices to prove the converse. Thus, let $G_m = F_{k_1 \cdots k_m}$ and set, for every $x \in R$,

$\xi_n(x) = \dfrac{1}{n} \sum_{k=1}^{n} I_{[X_k < x]}.$

Since, as $m, n \to \infty$,

$E(\xi_m(x) - \xi_n(x))^2 = \dfrac{|n - m|}{mn} (G_1(x) - G_2(x, x)) \to 0,$

it follows that there exists a r.v. $\xi(x)$ such that $E(\xi_n(x) - \xi(x))^2 \to 0$, and hence $\xi_n(x) \xrightarrow{P} \xi(x)$. Since the $\xi_n$ are bounded by 1, it follows, by the dominated convergence theorem and a.s. invariance under finite permutations of $X$'s of $\xi(x)$ $(x \in R)$, that

$E(\xi(x_1) \cdots \xi(x_m) \mid \xi) \leftarrow E(\xi_n(x_1) \cdots \xi_n(x_m) \mid \xi) \to P\{[X_1 < x_1, \ldots, X_m < x_m] \mid \xi\}.$

Thus $P^{\xi}[X_1 < x_1, \ldots, X_m < x_m] = \xi(x_1) \cdots \xi(x_m)$ a.s. Finally, since the function $\xi_n(\omega, x)$ is a d.f. in $x$, it follows that the function $\xi(\omega, x)$ has a.s. the properties of a d.f. in $x$, and therefore, in the preceding relation, $\xi(x)$ can be replaced by a c.d.f. (use, for example, the same method as in the proof of the c.d.'s existence theorem).
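
The easy direction of this equivalence (conditional independence with a common c.d. implies exchangeability) can be seen in a toy computation. The sketch below assumes hypothetical 0-1 valued r.v.'s that are independent with success pr. $\theta$ given $\theta$, where $\theta$ takes the values 1/4 and 3/4 with pr. 1/2 each; the mixing law and the function names are illustrative assumptions. The joint pr.'s are then expectations of products, as in the relation $F_{k_1 \cdots k_m} = E(F^{\mathcal B}(x_1) \cdots F^{\mathcal B}(x_m))$.

```python
from fractions import Fraction as F
from itertools import permutations, product

# Hypothetical mixing law for the random parameter theta.
mixing = {F(1, 4): F(1, 2), F(3, 4): F(1, 2)}


def joint(bits):
    """P[X_1 = bits[0], ..., X_m = bits[m-1]], computed as an expectation over theta."""
    total = F(0)
    for theta, weight in mixing.items():
        term = weight
        for b in bits:
            term *= theta if b else 1 - theta
        total += term
    return total


def exchangeable(m):
    """True if the joint pr. of every 0-1 pattern of length m is permutation invariant."""
    return all(joint(bits) == joint(perm)
               for bits in product((0, 1), repeat=m)
               for perm in permutations(bits))


p_one = joint((1,))      # P[X_1 = 1]
p_both = joint((1, 1))   # P[X_1 = 1, X_2 = 1]
```

Here `p_both` differs from `p_one ** 2`, so these r.v.'s are exchangeable without being independent.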

30.3. Chains; the elementary case. In the case of random vectors the definition of chain is as follows:

A sequence $X_n$ of random vectors is a chain if, for every integer $n$, a c. distribution of $X_{n+1}$ given $X_1, \ldots, X_n$ can (and will) be so selected that it coincides with a c. distribution of $X_{n+1}$ given $X_n$; in symbols $P^{X_1, \ldots, X_n}_{n+1} = P^{X_n}_{n+1}$ or, equivalently, the c. distribution $P(x_1, \ldots, x_n; S_{n+1})$ is independent of the a priori arguments $x_1, \ldots, x_{n-1}$. (On account of 27.2b, this definition entails chain-dependence as defined in 25.3; apply the second relation with $n$ replaced by $m + 1$, the subscript 1 by $(1, \ldots, n)$, 2 by $n + 1$, $\ldots$, $m + 1$ by $n + m$, and $g = I_{S_{n+1} \times \cdots \times S_{n+m}}$.)

Usually, the chained random vectors have a common range-space; to fix the ideas, consider a chain $X_n$ of r.v.'s. The terminology used is phenomenological. The chain is a "system" $X$ whose state at "time" $n$ is $X_n$ and has for values points $x$, the "possible" states. The c. distribution $P^{X_n}_{n+1}$ is the one-step transition pr. at time $n$. By the classical abuse of language, it is represented by the same symbol as its values. This symbol will be $P^{n,n+1}(x; S)$ and is read "pr. of passage from $x$ at time $n$ into $S$ at time $n + 1$." The very language used contains implicitly the assumption of chain-dependence.

If $P^{n,n+1}(x; S) = P(x; S)$ is independent of $n$, then the chain is said to be constant (in time); $P(x; S)$ is called the transition pr. function (f.) of the chain and read "pr. of passage from $x$ into $S$ in one step." From a phenomenological point of view, a constant chain represents a "random system" whose "law of evolution" does not vary in time. Let $P_n(S)$ denote the distribution of $X_n$. Since, for every $n$,

$P_{n+1}(S) = \int P_n(dx)\, P(x; S),$

it follows, by induction, that, for every pair $m$, $n$ of integers,

$P_{m+n}(S) = \int P_m(dx)\, P^n(x; S)$

where

$P^n(x; S) = \int P(x; dx_1) \int P(x_1; dx_2) \cdots \int P(x_{n-2}; dx_{n-1})\, P(x_{n-1}; S).$

Clearly, this relation implies and is implied by the relation

$P^{n+p}(x; S) = \int P^n(x; dx')\, P^p(x'; S); \quad n, p = 1, 2, \ldots.$

$P^n(x; S)$ is called the $n$-step transition pr. and read "pr. of passage from $x$ into $S$ in $n$ steps." Upon applying 27.2b, it is easily seen that the $n$-step transition pr. is a c. distribution of $X_{m+n}$ given $X_m$ $(m = 1, 2, \ldots)$. Upon applying 27.2a and 27.2A, we can summarize the basic properties of constant chains, as follows:

A. A function $P(x; S)$, of points $x \in R$ and Borel sets $S \subset R$, is the transition pr.f. of a constant chain of r.v.'s if, and only if, it is a Borel function in $x$ for every fixed $S$ and a pr. in $S$ for every fixed $x$.

The law of a constant chain of r.v.'s $X_n$, with distributions $P_n(S)$, is determined by the initial distribution $P_1(S)$ and the transition pr.f. $P(x; S)$. For every pair $m$, $n$ of integers,

$P_{m+n}(S) = \int P_m(dx)\, P^n(x; S),$

where the function $P^n(x; S)$ is determined by the relation

$P^{n+p}(x; S) = \int P^n(x; dx')\, P^p(x'; S); \quad n, p = 1, 2, \ldots.$

Let $P(x; S)$ be a transition pr.f. If there exists an initial distribution $P_1(S)$ such that, for every $n$, the consecutive distributions $P_n(S)$ coincide with the initial one, the transition pr.f. is said to possess an invariant distribution $P_1(S)$ and $P_1(S)$ is said to be invariant under the transition pr.f. $P(x; S)$; the chain whose law is determined by the invariant distribution $P_1(S)$ and by $P(x; S)$ is said to be stationary. In symbols, $P_1(S)$ is invariant under $P(x; S)$ if, for every $n$,

$P_1(S) = \int P_1(dx)\, P^n(x; S).$

Since, for every $n$,

$P_{n+1}(S) = \int P_n(dx)\, P(x; S),$

it suffices to require that this relation be valid for $n = 1$. It easily follows from 27.2b that, if the chain $X_n$ is stationary, then, for every $n$, the distribution of $(X_m, \ldots, X_{m+n})$ is independent of $m$.

A transition pr.f. $P(x, S)$ is elementary if there exists a countable partition $\sum S_j = R$ such that $P(x, S_k) = P_{jk}$ for all $x \in S_j$; thus it reduces to a transition pr. matrix and the only values of initial distributions which matter are of the form $P_j = P_1(S_j)$. We set $P^n_j = P_n(S_j)$ and $P^n_{jk} = P^n(x, S_k)$ for $x \in S_j$ and the basic properties of constant chains become

$P_{jk} \ge 0, \quad \sum_k P_{jk} = 1, \quad P^{m+n}_{jk} = \sum_h P^m_{jh} P^n_{hk};$

$P_j \ge 0, \quad \sum_j P_j = 1, \quad P^{m+n}_j = \sum_h P^m_h P^n_{hj}.$
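
In the elementary case these relations are plain matrix identities, which the following sketch checks in exact arithmetic. The two-state matrix `P`, the candidate invariant distribution and the helper names are hypothetical choices for illustration only.

```python
from fractions import Fraction as F

# Hypothetical transition pr. matrix (P_jk) on two states.
P = [[F(1, 2), F(1, 2)],
     [F(1, 4), F(3, 4)]]


def matmul(A, B):
    """Matrix product: entry (j, k) is the sum over h of A[j][h] * B[h][k]."""
    return [[sum(A[j][h] * B[h][k] for h in range(len(B))) for k in range(len(B[0]))]
            for j in range(len(A))]


def n_step(P, n):
    """n-step transition matrix (P^n_jk) for n >= 1."""
    result = P
    for _ in range(n - 1):
        result = matmul(result, P)
    return result


def push(p, P):
    """Distribution one step later: the k-th entry is the sum over j of p_j * P_jk."""
    return [sum(p[j] * P[j][k] for j in range(len(p))) for k in range(len(P[0]))]


invariant = [F(1, 3), F(2, 3)]   # candidate invariant distribution for this P
```

With these definitions, `matmul(n_step(P, 2), n_step(P, 3))` equals `n_step(P, 5)`, and `push(invariant, P)` returns `invariant` unchanged, so a chain started from it is stationary.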

Exponential convergence. The basic limits problem for constant chains is that of the asymptotic behavior of $n$-step transition pr.f.'s $P^n(x, S)$. A particularly simple yet a cornerstone case, which in essence goes back to Markov, is the exponential convergence case: there exists a set function $\bar P(S)$ and positive constants $a$, $b$ such that, for $n$ sufficiently large, $|P^n(x, S) - \bar P(S)| \le a e^{-bn}$ whatever be $x$ and $S$. This implies at once that $\bar P(S)$ is a pr.

In what follows we use repeatedly the fact that differences $\varphi(S)$ of two pr.'s vanish for $S = R$ so that $2\varphi(S)$ and $2|\varphi(S)|$ attain the same supremum $\mathrm{Var}\, \varphi = \int |\varphi(dy)|$ at a positive Hahn decomposition set of $\varphi(S)$ to be denoted by $H$, with or without affixes.

a. Invariance lemma. $\bar P(S)$ is invariant under transition pr.f.'s and $|P_{n+1}(S) - \bar P(S)| \le a e^{-bn}$ whatever be $P_1(S)$.

For

$\Big| \int \{P^n(x, dy) - \bar P(dy)\} P^m(y, S) \Big| \le \int |P^n(x, dy) - \bar P(dy)| \le 2a e^{-bn}$

implies that

$\bar P(S) \leftarrow P^{n+m}(x, S) = \int P^n(x, dy)\, P^m(y, S) \to \int \bar P(dy)\, P^m(y, S),$

and

$|P_{n+1}(S) - \bar P(S)| \le \int P_1(dx)\, |P^n(x, S) - \bar P(S)| \le a e^{-bn}.$

We introduce now a "measure" of chain dependence which originated with Markov. Let $\Delta_{n,n+m} = \sup_{x,y} \sup_S |P^n(x, S) - P^{n+m}(y, S)|$. The (generalized) Markov measure is $\Delta_n = \Delta_{n,n}$. Clearly $0 \le \Delta_n \le 1$, and in the independence case $\Delta_n = 0$ (since $x$, $y$ disappear) while in the deterministic case $\Delta_n = 1$ (since $P^n(x, S) = I(x, S)$).

b. Basic inequalities:

$\Delta_{n,n+m} \le \Delta_n$ and $\Delta_{n+m} \le \Delta_n \Delta_m.$

For

$|P^n(x, S) - P^{n+m}(y, S)| \le \int P^m(y, dz)\, |P^n(x, S) - P^n(z, S)| \le \Delta_n$

and if $\varphi_n(S) = P^n(x, S) - P^n(y, S)$, then

$|\varphi_{n+m}(S)| = \Big| \int_{H_n + H_n^c} \varphi_n(dz)\, P^m(z, S) \Big| \le \varphi_n(H_n) \sup_z P^m(z, S) + \varphi_n(H_n^c) \inf_z P^m(z, S)$

$= \varphi_n(H_n) \{\sup_z P^m(z, S) - \inf_z P^m(z, S)\} \le \Delta_n \Delta_m.$

B. Exponential convergence criterion. Exponential convergence holds if, and only if, $\Delta_h < 1$ for some integer $h$.

Proof. If exponential convergence holds, then

$|P^n(x, S) - P^n(y, S)| \le |P^n(x, S) - \bar P(S)| + |P^n(y, S) - \bar P(S)| \le 2a e^{-bn} \to 0.$

Conversely, if $\Delta_h < 1$ then, by b, as $m, n \to \infty$,

$|P^n(x, S) - P^{n+m}(y, S)| \le \Delta_n \le \Delta_h^{[n/h]} \to 0,$

hence $\lim P^n(x, S) = \bar P(x, S)$ exists and this limit is a function $\bar P(S)$ of $S$ only (set $m = 0$), so that $|P^n(x, S) - \bar P(S)| \le \Delta_h^{[n/h]}$ (let $m \to \infty$).
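
For a finite transition pr. matrix the Markov measure can be computed directly, since the supremum over $S$ of $P^n(x, S) - P^n(y, S)$ is attained on the Hahn set where row $x$ exceeds row $y$. The sketch below is an illustration under assumptions: the matrix `P` and the helper names are hypothetical, and the check of $\Delta_{n+m} \le \Delta_n \Delta_m$ is carried out only for this one example.

```python
from fractions import Fraction as F

# Hypothetical transition pr. matrix on two states.
P = [[F(1, 2), F(1, 2)],
     [F(1, 4), F(3, 4)]]


def matmul(A, B):
    """Matrix product of two square matrices given as lists of rows."""
    size = len(A)
    return [[sum(A[j][h] * B[h][k] for h in range(size)) for k in range(size)]
            for j in range(size)]


def markov_measure(Pn):
    """Delta_n: largest value over rows x, y of the sum of positive parts of (row x - row y)."""
    return max(sum(max(a - b, 0) for a, b in zip(row_x, row_y))
               for row_x in Pn for row_y in Pn)


P2 = matmul(P, P)
P3 = matmul(P2, P)
d1, d2, d3 = markov_measure(P), markov_measure(P2), markov_measure(P3)
```

Since `d1` is smaller than 1 here, the criterion applies with $h = 1$ and the rows of the powers of `P` approach a common limit at a geometric rate.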

Let $\mu$ be a $\sigma$-finite measure. By the Lebesgue decomposition theorem

$P^n(x, S) = \int_S p^n(x, y)\, \mu(dy) + P^n_s(x, S)$

where $p^n(x, y) \ge 0$ and $P^n_s(x, S)$ is $\mu$-singular.

Markov case (generalized). If $\inf_x p^h(x, y) \ge \delta > 0$ for all $y$ in some $\mu$-positive set $S$, then exponential convergence holds.

For, if $H$ is a Hahn set for the difference of pr.'s in $\Delta_h$, then

$P^h(x, H) - P^h(y, H) = 1 - \{P^h(x, H^c) + P^h(y, H)\}$

$\le 1 - \{P^h(x, H^c S) + P^h(y, HS)\}$

$\le 1 - \Big\{ \int_{H^c S} p^h(x, z)\, \mu(dz) + \int_{HS} p^h(y, z)\, \mu(dz) \Big\}$

$\le 1 - \delta \mu(S) < 1.$

c. Let $X_1, X_2, \ldots$ be a sequence of chained r.v.'s in the exponential convergence case and let $Y$ be a r.v. bounded by $c$ defined on $X_{n+m}, X_{n+m+1}, \ldots$ If $\bar E$ refers to $\bar P$, then

$$|\bar EY - E(Y \mid X_n)| \le 2ace^{-bm}.$$

For,

$$|\bar EY - E(Y \mid X_n = x)| = \Big|\int E(Y \mid X_{n+m} = y)\{\bar P(dy) - P^m(x,dy)\}\Big| \le 2ace^{-bm}.$$

This sequence behaves asymptotically as in the case of independence, as follows:

C. Exponential convergence theorem. In the exponential convergence case with chained r.v.'s $X_1, X_2, \ldots$, whatever be $P_1(S)$,

(i) $\dfrac{g(X_1) + \cdots + g(X_n)}{n} \xrightarrow{\text{a.s.}} \int g(x)\,\bar P(dx)$ for every Borel function $g$ for which the integral exists;

(ii) the limit laws of normed sums $\dfrac{g(X_1) + \cdots + g(X_n)}{b_n} - a_n$, $b_n \to \infty$, where $g$ is a finite Borel function, are stable and independent of $P_1(S)$.

Proof.

1° If $P_1 = \bar P$ then, by a, the sequence $g(X_1), g(X_2), \ldots$ is stationary and also is indecomposable since, for every invariant set $C$, $I_C(x) = P(x,C) = P^n(x,C) \to \bar P(C)$, so that assertion (i) follows by the stationarity theorem. Since the limit on and the indicator of the convergence set of the averages are tail functions on $X_1, X_2, \ldots$, it follows from a that (i) holds for any $P_1$.

2° To prove (ii) we can take $a_n = 0$ on account of the convergence of types theorem. Thus, let $\mathcal{L}(S_n/b_n) \to \mathcal{L}(X)$ with ch.f. $f$, where $S_n = \sum_{k=1}^n g(X_k)$. Then, by the same theorem, upon excluding the trivial case of degenerate limit laws, $b_n/b_{n+1} \to 1$ so that, given positive constants $c$, $c'$, there exists a sequence $m = m(n)$ such that $b_m/b_n \to c'/c$.

Let $P_1 = \bar P$. Then, by a, in

$$(1)\qquad cS_n/b_n + c(S_{n+p} - S_n)/b_n + c(S_{m+n+p} - S_{n+p})/b_n = cS_{m+n+p}/b_n,$$

the law of the middle term is $\mathcal{L}(cS_p/b_n) \to \mathcal{L}(0)$ for every fixed $p$. Thus we can and do select $p = p(n) \uparrow \infty$ such that, for these $p$, $\mathcal{L}(c(S_{n+p} - S_n)/b_n) \to \mathcal{L}(0)$ and hence, in passing to limit laws, we neglect the corresponding term in (1) while the "distance" $p$ between $S_n$ and $S_{m+n+p} - S_{n+p}$ increases indefinitely. But, by a and c,

$$E(\exp\{iuc(S_{m+n+p} - S_{n+p})/b_n\} \mid S_n) = E(\exp\{iuc(S_{m+1+p} - S_{1+p})/b_n\} \mid X_1)$$

$$= E(\exp\{iuc(S_{m+1+p} - S_{1+p})/b_n\}) + o(1)$$

$$= E(\exp\{iuc(b_m/b_n)(S_m/b_m)\}) + o(1) \to f(c'u),$$

so that the ch.f. $E(\exp\{iucS_n/b_n\}\,E(\exp\{iuc(S_{m+n+p} - S_{n+p})/b_n\} \mid S_n))$ of the sum of the extreme terms in the left side of (1) converges to $f(cu)f(c'u)$. It follows, by the convergence of types theorem, that there exists a constant $c''$ such that the ch.f. of $cS_{m+n+p}/b_n$ converges to $f(c''u) = f(cu)f(c'u)$ so that the limit law is stable.

Let $P_1 \ne \bar P$. For every fixed $k$, we can replace $S_n$ by $S_{n+k} - S_k$ in $\lim |\bar E(\exp\{iuS_n/b_n\}) - E(\exp\{iuS_n/b_n\} \mid X_1)|$ so that, by c, this expression is bounded by $2ae^{-bk} \to 0$ as $k \to \infty$. Therefore, the limit ch.f. given $X_1$ reduces to the limit ch.f. under $\bar P$, so does its expectation, and the proof is terminated.

COMPLEMENTS AND DETAILS 


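1. Let $\mathcal{B}$ be the $\sigma$-field in $\Omega = [0, 1]$ of Borel sets $B$ with or without affixes and let $\lambda$ be the Lebesgue measure on $\mathcal{B}$. Let $C \subset \Omega$ be a set of outer Lebesgue measure 1 and inner Lebesgue measure 0. Take for pr. space $(\Omega, \mathcal{S}, P)$: $\mathcal{S}$ is the $\sigma$-field of all sets of the form $A = B_1C + B_2C^c$ and $PA = \tfrac12\lambda B_1 + \tfrac12\lambda B_2$. Then $PB = \lambda B$, $PC = \tfrac12$, and there is no regular c.pr. given $\mathcal{B}$.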


Chapter IX 


FROM INDEPENDENCE TO DEPENDENCE 


The problems in and the methods developed for the independence 
case can be transposed to the general case. This permits us to enlarge 
the domains of validity of the results obtained in the independence case 
and also to realize the range of the methods. 

In the last section of this chapter appears a different method, that of indefinite expectations, which leads to more general results for a.s. convergence and is used extensively in the next chapter.

§31. CENTRAL ASYMPTOTIC PROBLEM 
The Central Limit Problem is concerned with convergence of sequences $\mathcal{L}(X_n)$ of laws of sums $X_n = \sum_{k=1}^{k_n} X_{nk}$ of r.v.'s. In order to investigate this problem in the case of dependent summands, we have to extend it to a Central Asymptotic Problem concerned with the comparison of the asymptotic behaviors (as $n \to \infty$) of $\mathcal{L}(X_n)$ and of suitably chosen laws $\mathcal{L}(Y_n)$. In fact, already in the case of independent summands, the investigation of the Central Limit Problem was based upon the comparison of laws of sums with suitably chosen infinitely decomposable laws.

The tools we shall require are, naturally enough, extensions of those 
used in the Central Limit Problem for independent summands. We 
write 

$$H = F - G, \qquad h = f - g, \qquad \hat h = \hat f - \hat g,$$

with the same affixes (if any) throughout, for differences of d.f.'s $F$, $G$ and corresponding ch.f.'s $f$, $g$ and integral ch.f.'s $\hat f$, $\hat g$.

31.1. Comparison of laws. In what follows we state properties which either result at once from those of d.f.'s and ch.f.'s or are obtained by means of identical arguments.

I. Every function $H = F - G$ is bounded by 1 (in absolute value), is continuous from the left, has a countable discontinuity set, and

$$H(x + 0) = F(x + 0) - G(x + 0),$$

$$\int dH = \operatorname{Var} F - \operatorname{Var} G, \qquad \operatorname{Var} H = \int |dH| \le 2.$$


We write $H_n \xrightarrow{w} H$ if

$$\int g\,dH_n \to \int g\,dH$$

for all $g \in C_0$, the family of continuous functions on $R$ vanishing at infinity. Note that in the case of d.f.'s, by the weak convergence criterion, this convergence is their weak convergence.

The weak compactness theorem is valid for functions H: every sequence 
H n is weakly compact. 


We write $H_n \xrightarrow{c} H$ when $H_n \xrightarrow{w} H$ and $\int dH_n \to \int dH$.


The Helly-Bray theorem is not valid for functions H. 

Its proof breaks down the moment we use convergence of variations, since $\int dH_n \to \int dH$ does not entail $\int |dH_n| \to \int |dH|$.

II. The functions $h$ and $\hat h$ are defined by

$$h(u) = \int e^{iux}\,dH(x), \qquad \hat h(u) = \int_0^u h(v)\,dv = \int \frac{e^{iux} - 1}{ix}\,dH(x).$$

$h$ on $R$ is continuous and bounded by 2 but the relation $|h| \le h(0)$ is not valid.

The inversion formula is valid:

$$H(a) - H(b) = \lim_{U \to \infty} \frac{1}{2\pi}\int_{-U}^{+U} \frac{e^{-iua} - e^{-iub}}{-iu}\,h(u)\,du.$$


The weak convergence criterion is valid: $H_n \xrightarrow{w} H$ up to additive constants if, and only if, $\hat h_n \to \hat h$.

The continuity theorem is valid: if $h_n \to k$ continuous at $u = 0$, then $H_n \xrightarrow{c} H$ up to additive constants and $h = k$.

However, the converse is not valid, for the proof given for d.f.'s breaks down when the Helly-Bray theorem (which is no longer valid) is to be applied; for example, if $h_n(u) = e^{-inu} - e^{inu}$, then the sequence $h_n$ does not converge on $R$, while $H_n(x) = 1$ for $|x| < n$ and $= 0$ for $|x| > n$, so that $H_n \xrightarrow{c} 1$.
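The counterexample can be made concrete. A minimal sketch, assuming that $H_n = F_n - G_n$ with $F_n$ and $G_n$ the d.f.'s degenerate at $-n$ and $+n$ (which gives the $h_n$ and $H_n$ of the example); the test function `g` and the function names are choices made for this illustration only.

```python
import cmath
import math


def g(x):
    """A test function in C_0: continuous and vanishing at infinity."""
    return math.exp(-(x - 1.0) ** 2)


def integral_g_dH(n):
    """Integral of g with respect to H_n = F_n - G_n, where F_n and G_n are
    degenerate at -n and +n: a jump of +1 at -n and a jump of -1 at +n."""
    return g(-n) - g(n)


def h(n, u):
    """Difference of ch.f.'s h_n(u) = e^{-inu} - e^{inu} = -2i sin(nu)."""
    return cmath.exp(-1j * n * u) - cmath.exp(1j * n * u)


# At u = pi/2 the values cycle through -2i, 0, 2i, 0: no limit exists.
values_at_half_pi = [h(n, math.pi / 2) for n in (1, 2, 3, 4)]
```

The integrals `integral_g_dH(n)` tend to 0 as `n` grows, and the total mass $\int dH_n$ is 0 for every $n$, while `h(n, u)` keeps oscillating. This is the sense in which convergence of the $H_n$ does not entail convergence of the $h_n$.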

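III. Expansion of $h$. If $S$ is a Borel set and $0 < \delta \le 1$ then (provided the integrals exist)

$$|h(u)| \le \sum_{j=0}^{l} \frac{|u|^j}{j!}\Big|\int x^j\,dH\Big| + c_{l\gamma}|u|^{l+\gamma}\int_{S^c} |x|^{l+\gamma}\,|dH| + \sum_{j=l+1}^{m} \frac{|u|^j}{j!}\Big|\int_S x^j\,dH\Big| + c_{m\delta}|u|^{m+\delta}\int_S |x|^{m+\delta}\,|dH|$$

where, if $l \ge 1$, then $0 < \gamma \le 1$, and, if $l = 0$, then $0 \le \gamma \le 1$; the $c$'s depend only on their subscripts.

If the right side is infinite, the inequality is trivially true. If the right side is finite, it follows from

$$h(u) = \int e^{iux}\,dH(x) = \int dH + \int_{S^c} (e^{iux} - 1)\,dH + \int_S (e^{iux} - 1)\,dH;$$

use limited expansions of $e^{iux}$ of order $l + \gamma$ and $m + \delta$; and for $l + \gamma = 0$ use $|e^{iux} - 1| \le 2$.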

We can now proceed to the comparison of sequences of laws. Two sequences $\mathcal{L}(X_n)$ and $\mathcal{L}(Y_n)$ are said to be weakly equivalent, and we write $\mathcal{L}(X_n) \overset{w}{\sim} \mathcal{L}(Y_n)$, if the two sequences have the same weak limit laws for same subsequences of subscripts; in other words, if $\mathcal{L}(X_{n'}) \xrightarrow{w} \mathcal{L}$, then $\mathcal{L}(Y_{n'}) \xrightarrow{w} \mathcal{L}$ and conversely. We observe that $\mathcal{L}(X_{n'}) \xrightarrow{w} \mathcal{L}$ means that $F_{n'} \xrightarrow{w} F$ up to additive constants. We define complete equivalence $\mathcal{L}(X_n) \sim \mathcal{L}(Y_n)$ by replacing in what precedes "weakly" by "completely."

In what follows we use repeatedly properties I and II without further comment.

A. Weak equivalence criterion. $\mathcal{L}(X_n) \overset{w}{\sim} \mathcal{L}(Y_n)$ if, and only if, $F_n - G_n \xrightarrow{w} 0$ up to additive constants or $\hat f_n - \hat g_n \to 0$.

Proof. It suffices to consider $F_n - G_n$; the assertion with $\hat f_n - \hat g_n$ follows.

Let $F_n - G_n \xrightarrow{w} 0$ up to additive constants. The weakly compact sequence $F_n$ contains subsequences $F_{n'} \xrightarrow{w} F$ to which correspond subsequences $G_{n'} = F_{n'} - (F_{n'} - G_{n'}) \xrightarrow{w} F$ up to additive constants. It follows that $\mathcal{L}(X_n) \overset{w}{\sim} \mathcal{L}(Y_n)$.

Conversely, let $\mathcal{L}(X_n) \overset{w}{\sim} \mathcal{L}(Y_n)$. The weakly compact sequence $F_n - G_n$ contains subsequences $F_{n'} - G_{n'} \xrightarrow{w}$ some $H$ and the weakly compact sequence $F_{n'}$ contains subsequences $F_{n''} \xrightarrow{w}$ some $F$. By hypothesis $G_{n''} \xrightarrow{w} F$ up to additive constants and, hence, $F_{n''} - G_{n''} \xrightarrow{w} H = F - F = 0$ up to additive constants. It follows that the weakly compact sequence $F_n - G_n \xrightarrow{w} 0$ up to additive constants. The proof is concluded.

B. Complete equivalence criterion. Let the sequences $\mathcal{L}(X_n)$ or $\mathcal{L}(Y_n)$ be completely compact. Then $\mathcal{L}(X_n) \sim \mathcal{L}(Y_n)$ if, and only if, $F_n - G_n \xrightarrow{c} 0$ up to additive constants or $f_n - g_n \to 0$.

Proof. Since $f_n - g_n \to 0$ implies that $F_n - G_n \xrightarrow{c} 0$ up to additive constants, it suffices to prove the "if" assertion with $F_n - G_n$ and the "only if" assertion with $f_n - g_n$.

If $F_n - G_n \xrightarrow{c} 0$ up to additive constants, then to every completely convergent subsequence $F_{n'} \xrightarrow{c}$ some $F$ there corresponds the subsequence $G_{n'} \xrightarrow{c} F$ up to additive constants, and conversely. It follows that $\mathcal{L}(X_n) \sim \mathcal{L}(Y_n)$.

If $\mathcal{L}(X_n) \sim \mathcal{L}(Y_n)$, then one of the sequences $f_n$ or $g_n$ being completely compact in the sense of convergence to continuous functions, the same is true of both sequences and, hence, of the sequence $f_n - g_n$. If $f_{n'} - g_{n'} \to h$, then the sequence $f_{n'}$ contains a subsequence $f_{n''} \to$ some $f$ and, by hypothesis, $g_{n''} \to f$. Therefore, $f_{n''} - g_{n''} \to h = 0$, the unique limit element of the completely compact sequence $f_n - g_n$. It follows that $f_n - g_n \to 0$, and the proof is concluded.

Remark. In the proof of the "if" assertion we made use of the complete compactness of $\mathcal{L}(X_n)$ only to assert that $F_{n'} \xrightarrow{c}$ some $F$. Let us make the natural convention that, when neither of the sequences $\mathcal{L}(X_n)$ and $\mathcal{L}(Y_n)$ has a complete limit element, then $\mathcal{L}(X_n) \sim \mathcal{L}(Y_n)$. Thus, $F_n - G_n \xrightarrow{c} 0$ up to additive constants implies that if the sequence $\mathcal{L}(X_n)$ has no complete limit element the same is true of the sequence $\mathcal{L}(Y_n)$, and conversely. In other words, with the foregoing convention the assumption of complete compactness is unnecessary for the "if" assertion:

If $F_n - G_n \xrightarrow{c} 0$ up to additive constants, or $f_n - g_n \to 0$, then $\mathcal{L}(X_n) \sim \mathcal{L}(Y_n)$.


We shall frequently center the $X_n$ at some suitably chosen conditional expectations $\xi_n$. We observe that

Corollary. If $\xi_n \xrightarrow{P} 0$, then $\mathcal{L}(X_n - \xi_n) \sim \mathcal{L}(X_n)$.

This follows from the law-equivalence lemma.

31.2. Comparison of summands. Let

$$X_n = \sum_k X_{nk}, \qquad Y_n = \sum_k Y_{nk}, \qquad k = 1, \ldots, k_n,$$

$$Z_{nk} = X_{n0} + \cdots + X_{n,k-1} + Y_{n,k+1} + \cdots + Y_{n,k_n+1},$$

$$X_{n0} = Y_{n,k_n+1} = 0.$$

To $X$ and $Y$ with or without affixes there correspond their d.f.'s $F$ and $G$ and ch.f.'s $f$ and $g$ with same affixes if any; primes will denote conditioning by $Z_{nk}$, unless otherwise stated; for example,

$$F'_{nk} = P[X_{nk} < x \mid Z_{nk}], \qquad f'_{nk}(u) = E'e^{iuX_{nk}} = E(e^{iuX_{nk}} \mid Z_{nk}).$$

For every fixed value of $Z_{nk}$, the selected conditional d.f.'s and ch.f.'s have all the properties of d.f.'s and ch.f.'s, and all properties of differences $H = F - G$, $h = f - g$ given in the preceding subsection are valid for the conditional differences $H' = F' - G'$, $h' = f' - g'$.


We intend to compare the sequences $\mathcal{L}(X_n)$ and $\mathcal{L}(Y_n)$ through the summands $X_{nk}$ and $Y_{nk}$. (Let us observe that it is frequently convenient to compare suitably selected partial sums, each partial sum to be considered as a single summand.) We are at liberty to introduce any suitable dependence between the sets $\{X_{nk}\}$ and $\{Y_{nk}\}$, provided the laws of each of these sets are not modified and, in fact, provided the sequences $\mathcal{L}(X_n)$ and $\mathcal{L}(Y_n)$ remain the same.

A. Comparison theorem. $\mathcal{L}(X_n) \sim \mathcal{L}(Y_n)$

if $\sum_k E|f'_{nk} - g'_{nk}| \to 0$

or if, $S$ being a Borel set fixed or not (depending on $n$ and/or $k$ or not),

(i) $\sum_k E\Big|\int x^j\,(dF'_{nk} - dG'_{nk})\Big| \to 0$, $j \le l$,

(ii) $\sum_k E\int_{S^c} |x|^{l+\gamma}\,|dF'_{nk} - dG'_{nk}| \to 0$,

(iii) $\sum_k E\Big|\int_S x^j\,(dF'_{nk} - dG'_{nk})\Big| \to 0$, $j = l+1, \ldots, m$,

(iv) $\sum_k E\int_S |x|^{m+\delta}\,|dF'_{nk} - dG'_{nk}| \to 0$, $0 < \delta \le 1$ fixed.

If $l = 0$ condition (i) disappears and $0 \le \gamma \le 1$; if $l \ge 1$, then $0 < \gamma \le 1$.

Proof. The first assertion follows by the complete convergence criterion from the inequality

$$|f_n - g_n| \le \sum_k E|f'_{nk} - g'_{nk}|$$

given by

$$\Big|E\big(e^{iu\sum_k X_{nk}} - e^{iu\sum_k Y_{nk}}\big)\Big| = \Big|E\sum_k (e^{iuX_{nk}} - e^{iuY_{nk}})e^{iuZ_{nk}}\Big| \le \sum_k E\big|E'(e^{iuX_{nk}} - e^{iuY_{nk}})\big|.$$

The second assertion follows then from the expansion 28.1 III, and the theorem is proved.

It is important to observe that the theorem, and hence all which follows, remain valid with "finer" conditioning than by $Z_{nk}$. In other words, we can condition by any collection of $X_{nk}$'s and $Y_{nk}$'s of which $Z_{nk}$ is a function. In particular, we can condition by the random vectors $Z'_{nk} = (X_{n0}, \ldots, X_{n,k-1}, Y_{n,k+1}, \ldots, Y_{n,k_n+1})$ or $Z''_{nk} = (X_{n0} + \cdots + X_{n,k-1},\ Y_{n,k+1} + \cdots + Y_{n,k_n+1})$.

First approach. To $X_n = \sum_k X_{nk}$ we make correspond $Y_n = X^*_n = \sum_k X^*_{nk}$ where the summands $X^*_{nk}$ are independent, and independent of the $X_{nk}$, and $\mathcal{L}(X^*_{nk}) = \mathcal{L}(X_{nk})$. Loosely speaking, $\mathcal{L}(X^*_n)$ is obtained from $\mathcal{L}(X_n)$ by suppressing the dependence between summands.

If $\mathcal{L}(X_n) \sim \mathcal{L}(X^*_n)$, we say that the summands $X_{nk}$ are asymptotically independent. The foregoing equivalence and comparison theorems yield conditions for asymptotic independence upon replacing $g'$ by $f$ and $G'$ by $F$. We can thus transform the results of the investigation of the Central Limit Problem in the case of independence. Furthermore, we use the conditioning by the vector $Z''_{nk}$. It is easily seen that, because of the independence assumption, it reduces to conditioning by $X_{n0} + \cdots + X_{n,k-1}$. As an example, let us give a first extension of Liapounov's theorem.







Let $X_{nk} = X_k/s_n$ with $EX_k = 0$, $s_n^2 = \sum_{k=1}^n \sigma^2 X_k$, $k \le n$. The conditioning is by $X_1 + \cdots + X_{k-1}$.

B. Under Liapounov's condition

$$\frac{1}{s_n^{2+\delta}}\sum_k E|X_k|^{2+\delta} \to 0,$$

if

$$\frac{1}{s_n}\sum_k E|E'X_k| \to 0 \quad\text{and}\quad \frac{1}{s_n^2}\sum_k E|E'X_k^2 - EX_k^2| \to 0,$$

then $\mathcal{L}\big(\sum_k X_k/s_n\big) \to \mathcal{N}(0, 1)$.

It suffices to apply the comparison theorem with $l = 0$, $m = 2$, and $S = R$ to $X_{nk}$ and $Y_{nk} = X^*_{nk}$, use the inequality

$$E\int |x|^{2+\delta}\,|dF'_{nk} - dF_{nk}| \le EE'|X_{nk}|^{2+\delta} + E|X_{nk}|^{2+\delta} = 2E|X_{nk}|^{2+\delta},$$

and apply Liapounov's theorem to the $X^*_{nk}$.

Second approach. We can obtain directly results which even in the case of independence are more general than those we obtained for the Central Limit Problem (since they pertain to the more general Central Asymptotic Problem):

For every fixed $n$, the comparison summands $Y_{nk}$ are selected so that

- the $Y_{nk}$ are independent and $\mathcal{L}(Y_n)$ belongs to the family of limit laws we seek to obtain;

- the sets $\{Y_{nk}\}$ and $\{X_{nk}\}$ are independent.

As an example, let us prove a Lindeberg type of normal convergence. Let the $X_{nk}$ be centered at their conditional expectations so that $EX_{nk} = EE'X_{nk} = 0$, and set $\sigma_{nk}^2 = EX_{nk}^2$, $\sigma'^2_{nk} = E'X_{nk}^2$.

C. Under Lindeberg's condition:

(i) $\sum_k \int_{|x| \ge \varepsilon} x^2\,dF_{nk} \to 0$ for every $\varepsilon > 0$, and

(ii) $\sigma_n^2 = \sum_k \sigma_{nk}^2 \le \sigma^2 < \infty$ for every $n$,

if

(iii) $\sum_k E|\sigma'^2_{nk} - \sigma_{nk}^2| \to 0$,

then $\mathcal{L}(X_n) \sim \mathcal{N}(0, \sigma_n^2)$.

Proof. Since, by (i), as $n \to \infty$ and then $\varepsilon \to 0$,

$$\max_k \sigma_{nk}^2 \le \varepsilon^2 + \max_k \int_{|x| \ge \varepsilon} x^2\,dF_{nk} \to 0,$$

it follows, by (ii), that

$$(1)\qquad \sum_k \sigma_{nk}^3 \le \sigma^2 \max_k \sigma_{nk} \to 0.$$

Take the summands $Y_{nk}$ to be mutually independent and normal $\mathcal{N}(0, \sigma_{nk}^2)$, and take $S = (-\varepsilon, +\varepsilon)$, $l = \gamma = 1$, $m = 2$, $\delta = 1$. The comparison conditions for $j = 1$ and 2 are fulfilled, the first because of the centering and the second because of (iii) and because the condition for $l + \gamma$ is fulfilled by (i) and (1) since

$$\sum_k E\int_{|x| \ge \varepsilon} x^2\,|dF'_{nk} - dG_{nk}| \le \sum_k \int_{|x| \ge \varepsilon} x^2\,dF_{nk} + c\varepsilon^{-1}\sum_k \sigma_{nk}^3 \to 0.$$

Finally, the condition for $m + \delta$ is fulfilled by (ii) and (1), since

$$\sum_k E\int_{|x| < \varepsilon} |x|^{2+\delta}\,|dF'_{nk} - dG_{nk}| \le 2\varepsilon^{\delta}\sigma^2$$

and $\varepsilon > 0$ is arbitrarily small. The theorem is proved.

The reader may proceed in a similar fashion and obtain or extend other results of the case of independence.

*31.3. Weighted prob. laws. The second approach outlined in the preceding subsection yields the same prob. laws as in the case of independence. However, as we shall see, under similar but less restrictive conditions, disappearance of independence brings forth not the same prob. laws but their "weighted averages." The conditional law of a r.v. $X$ given a sub $\sigma$-field $\mathcal{B}$ of events is defined by the conditional d.f. $F^{\mathcal{B}}$ or the conditional ch.f. $f^{\mathcal{B}}$. The d.f. $F$ and the ch.f. $f$ are then given by $F = EF^{\mathcal{B}}$, $f = Ef^{\mathcal{B}}$.

If the conditioning $\sigma$-field $\mathcal{B}$ is induced by some measurable function $V$ not necessarily finite, nor even necessarily numerical, then, denoting by $W$ the pr. distribution of $V$, we write

$$F = \int F^v\,dW(v), \qquad f = \int f^v\,dW(v).$$

We say that $W$ is the weight function of the parameter $v$ and $F$ (or $f$) represents the weighted law over the family $F^v$ (or $f^v$) of laws.

Examples

1° A weighted law over the family of degenerate laws is of the form

$$f_W(u) = \int e^{iua}\,dW(a)$$

where $W$ on $R$ is a d.f. In other words, if $w$ is the ch.f. corresponding to $W$, we simply have $f_W = w$. It follows that, if the only "weighted" parameter is the shift-parameter, that is, the family of laws consists of $f_a(u) = e^{iua}f(u)$, $a \in R$, then

$$f_W(u) = \int e^{iua}f(u)\,dW(a) = w(u)f(u),$$

and the "weighting" over the family reduces to the composition of a law with that represented by $f$. In other words, the weighting of the shift-parameter alone reduces to the composition of two laws and presents no new interest.
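The identity $f_W = w f$ for a weighted shift-parameter can be checked numerically. A minimal sketch, assuming a standard normal base law and a hypothetical two-point weight `SHIFTS` (both are choices made for this illustration): the ch.f. of the mixture density is computed by a trapezoidal rule and compared with the product $w(u)f(u)$.

```python
import cmath
import math

# Hypothetical two-point weight W on the shift parameter: (shift a, mass).
SHIFTS = [(-1.0, 0.3), (2.0, 0.7)]


def f(u):
    """ch.f. of the base law, here N(0, 1)."""
    return math.exp(-u * u / 2.0)


def w(u):
    """ch.f. of the weight W."""
    return sum(p * cmath.exp(1j * u * a) for a, p in SHIFTS)


def mixture_density(x):
    """Density of the weighted law: a mixture of shifted normal densities."""
    return sum(p * math.exp(-(x - a) ** 2 / 2.0) / math.sqrt(2.0 * math.pi)
               for a, p in SHIFTS)


def weighted_chf(u, lo=-15.0, hi=16.0, steps=31000):
    """Trapezoidal approximation of the integral of e^{iux} against the mixture density."""
    dx = (hi - lo) / steps
    total = 0.5 * (cmath.exp(1j * u * lo) * mixture_density(lo)
                   + cmath.exp(1j * u * hi) * mixture_density(hi))
    for k in range(1, steps):
        x = lo + k * dx
        total += cmath.exp(1j * u * x) * mixture_density(x)
    return total * dx
```

For any `u`, `weighted_chf(u)` agrees with `w(u) * f(u)` up to quadrature error, which is the composition of the base law with the law of the shift.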

2° The limit laws which emerged in the development of pr. theory are the normal, Poisson, and, more generally, the infinitely decomposable laws. The corresponding weighted laws are

weighted normal:

$$f_W(u) = \int \exp\Big[iu\alpha - \frac{\sigma^2u^2}{2}\Big]\,dW(\alpha, \sigma^2)$$

weighted Poisson:

$$f_W(u) = \int_0^\infty \exp[\lambda(e^{iu} - 1)]\,dW(\lambda)$$

weighted infinitely decomposable:

$$f_W(u) = \int \exp\Big[iu\alpha + \int g(x,u)\,d\Psi(x)\Big]\,dW(\alpha, \Psi)$$

where $g(x,u) = \Big(e^{iux} - 1 - \dfrac{iux}{1 + x^2}\Big)\dfrac{1 + x^2}{x^2}$ and the functions $\Psi$ on $R$ are nondecreasing, continuous from the left, and of bounded variation.

If $W$ degenerates at some element $(\alpha, \sigma^2)$ or $(\lambda)$ or $(\alpha, \Psi)$, then we get back the corresponding nonweighted laws. A systematic investigation, with restrictions on $\alpha$, say $\alpha$ constant (since any law is a weighted degenerate), would be of great interest. We say only a few words about the weighted symmetric stable laws.

A weighted symmetric stable is defined by

$$f_W(u) = \int_0^\infty \exp[-c|u|^\gamma/2]\,dW(c), \qquad 0 < \gamma \le 2.$$

It is a Laplace-Stieltjes transform in $|u|^\gamma$. Hence, on account of the known properties of such transforms,

a. There is a one-to-one correspondence between a weighted symmetric stable and the weight d.f. defined up to additive constants. In particular, a weighted symmetric stable reduces to a symmetric stable if, and only if, the weight function is a degenerate d.f. of a r.v.

Furthermore, if $W_n \xrightarrow{w} W$ up to additive constants, then, by the extended Helly-Bray lemma and the fact that we can set $W_n(x) = 0$ for $x < 0$,

$$f_{W_n}(u) = \int_0^\infty \exp[-c|u|^\gamma/2]\,dW_n(c) \to \int_0^\infty \exp[-c|u|^\gamma/2]\,dW(c) = f_W(u), \qquad u \ne 0.$$

Conversely, let $f_{W_n} \to g$. By the weak compactness theorem, there is a d.f. $W$ and a subsequence $W_{n'} \xrightarrow{w} W$, so that, by what precedes, $g(u) = \int_0^\infty \exp[-c|u|^\gamma/2]\,dW(c)$. But by a, $g$ determines $W$ up to additive constants. Hence $W_n \xrightarrow{w} W$ up to additive constants and $g = f_W$. Thus

b. The limit elements of a sequence of weighted symmetric stable are weighted symmetric stable with same exponent.


Weighted stable laws appear in the case of sequences of exchangeable r.v.'s since, by 27.2, 2°, they are conditionally independent given a sub $\sigma$-field and 23.4 applies under this conditioning. In a different guise, weighted laws appear in the third approach where

The conditional laws $\mathcal{L}'(Y_{nk})$ will be of the limit type obtained under similar conditions in the independence case.

We use the following notation:

$$\alpha'_{nk}(\varepsilon) = \int_{|x| < \varepsilon} x\,dF'_{nk}, \qquad \alpha'_n(\varepsilon) = \sum_k \alpha'_{nk}(\varepsilon),$$

$$\sigma'^2_{nk}(\varepsilon) = \int_{|x| < \varepsilon} x^2\,dF'_{nk} - \Big(\int_{|x| < \varepsilon} x\,dF'_{nk}\Big)^2, \qquad \sigma'^2_n(\varepsilon) = \sum_k \sigma'^2_{nk}(\varepsilon);$$

we drop the primes if $F'$ is replaced by $F$, and drop $\varepsilon$ if $\varepsilon = +\infty$.

Before we attack the extension of the more general i.d. case, let us give, as an example, the extension of the historically important Liapounov's theorem.

A. Let the $X_{nk}$ be centered at their conditional expectations. If

$$\sum_k E|X_{nk}|^{2+\delta} \to 0$$

for a $\delta > 0$, then $\mathcal{L}(X_n) \sim \mathcal{L}\big(\sum_k Y_{nk}\big)$ with $\mathcal{L}'(Y_{nk}) = \mathcal{N}(0, \sigma'^2_{nk})$.

Proof. Take the $Y_{nk}$ to be conditionally normal $\mathcal{N}(0, \sigma'^2_{nk})$, so that the law of $Y_{nk}$ is the weighted symmetric normal $E\mathcal{N}(0, \sigma'^2_{nk})$, and apply the comparison theorem with $S = R$, $l = 0$, $m = 2$, and $\delta \le 1$. The comparison conditions for $j = 1, 2$ are fulfilled, since the corresponding sums vanish, the first because of the centering and the second because of $E'Y_{nk}^2 = E'X_{nk}^2$. The condition with $m + \delta$ is fulfilled, since

$$E|Y_{nk}|^{2+\delta} = EE'|Y_{nk}|^{2+\delta} = cE\sigma'^{\,2+\delta}_{nk} \le cEE'|X_{nk}|^{2+\delta} = cE|X_{nk}|^{2+\delta}$$

and hence

$$\sum_k E\int |x|^{2+\delta}\,|dF'_{nk} - dG'_{nk}| \le (1 + c)\sum_k E|X_{nk}|^{2+\delta} \to 0.$$

The theorem is proved.

Remark. If we add the hypothesis that the d.f. $W_n$ of $\sigma'^2_n$ converges weakly to $W$, then the sequence $\mathcal{L}(X_n)$ converges to a weighted symmetric normal. This limit law is that of a r.v. if, and only if, $W_n \xrightarrow{c} W$, and then it is normal if, and only if, $W$ degenerates. Similar considerations apply to what follows.
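The last statement of the Remark, that a weighted symmetric normal is normal only when the weight degenerates, can be illustrated through fourth moments. A minimal sketch, assuming a weight with finitely many atoms on the variance; the two example weights and the function names are hypothetical. For such a mixture $EX^2 = E\sigma^2$ and $EX^4 = 3E\sigma^4$, so the ratio $EX^4/(EX^2)^2$ equals 3 exactly when the variance is constant.

```python
import math
from fractions import Fraction


def weighted_normal_chf(u, weights):
    """f_W(u) = sum over atoms (v, p) of p * exp(-v u^2 / 2), v being the variance."""
    return sum(float(p) * math.exp(-float(v) * u * u / 2.0) for v, p in weights)


def kurtosis(weights):
    """E X^4 / (E X^2)^2 for the variance mixture, using
    E X^2 = E v and E X^4 = 3 E v^2 (normal fourth moment)."""
    second = sum(p * v for v, p in weights)
    fourth = 3 * sum(p * v * v for v, p in weights)
    return fourth / (second * second)


# Hypothetical weights on the variance: (variance v, mass p).
DEGENERATE = [(Fraction(2), Fraction(1))]
TWO_POINT = [(Fraction(1), Fraction(1, 2)), (Fraction(4), Fraction(1, 2))]
```

With `DEGENERATE` the ratio is exactly 3, the normal value; with `TWO_POINT` it is 102/25, so the weighted law is symmetric with ch.f. `weighted_normal_chf` but is not normal.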

We pass now to the limit weighted i.d. laws. We require the notion of conditional uniform asymptotic negligibility, for short uan', defined by

$$\max_k P'[|X_{nk}| \ge \eta] \to 0 \quad\text{for every } \eta > 0.$$

In the case of independent summands, the uan' condition reduces to the uan condition and in the general case implies it, since, by the dominated convergence theorem,

$$\max_k P[|X_{nk}| \ge \eta] \le E\big(\max_k P'[|X_{nk}| \ge \eta]\big) \to 0.$$

c. Under uan' condition, for every $\varepsilon > 0$

(i) $\max_k \int_{|x| < \varepsilon} |x|^s\,dF'_{nk} \to 0$ for $s > 0$;

(ii) $\max_k \int_{|x| < \varepsilon} |x - \alpha'_{nk}(\varepsilon)|^s\,dF'_{nk} \to 0$ for $s \ge 1$;

(iii) $\max_k |\gamma'_{nk}(u)| = \max_k \Big|\int (e^{iu(x - \alpha'_{nk}(\varepsilon))} - 1)\,dF'_{nk}\Big| \to 0$.

Proof. Let $0 < \eta < \varepsilon$. Since

$$\int_{|x| < \varepsilon} |x|^s\,dF'_{nk} \le \eta^s + \varepsilon^s P'[|X_{nk}| \ge \eta],$$

assertion (i) follows by taking $\max_k$ and letting $n \to \infty$ and then $\eta \to 0$.

Assertion (ii) follows, on account of (i), from

$$\int_{|x| < \varepsilon} |x - \alpha'_{nk}(\varepsilon)|^s\,dF'_{nk} \le 2^{s-1}\int_{|x| < \varepsilon} |x|^s\,dF'_{nk} + 2^{s-1}|\alpha'_{nk}(\varepsilon)|^s$$

and

$$|\alpha'_{nk}(\varepsilon)|^s \le \varepsilon^{s-1}|\alpha'_{nk}(\varepsilon)|.$$

Finally, assertion (iii) follows, on account of (ii), from

$$|\gamma'_{nk}(u)| \le 2\int_{|x| \ge \varepsilon} dF'_{nk} + |u|\int_{|x| < \varepsilon} |x - \alpha'_{nk}(\varepsilon)|\,dF'_{nk}.$$
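Let

$$\log g'_{nk}(u) = iu\alpha'_{nk}(\varepsilon) + \gamma'_{nk}(u).$$

d. $\sum_k E|f'_{nk}(u) - g'_{nk}(u)| \le c\sum_k E|\gamma'_{nk}(u)|^2$.

This follows (upon dropping the subscripts $n$, $k$) by $|\gamma'| \le 2$, hence $|e^{\gamma'} - 1 - \gamma'| \le e^2|\gamma'|^2$, from

$$|f'(u) - g'(u)| = |e^{iu\alpha'}(1 + \gamma'(u)) - e^{iu\alpha'}e^{\gamma'(u)}| = |1 + \gamma'(u) - e^{\gamma'(u)}| \le c|\gamma'(u)|^2.$$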

B. Under uan', if for every $n$

(i) $\sum_k \int_{|x| \ge \varepsilon} dF_{nk}(x) \le c' < \infty$

and

(ii) $\sum_k E\sigma'^2_{nk}(\varepsilon) \le c'' < \infty$,

then $\mathcal{L}\big(\sum_k X_{nk}\big) \sim \mathcal{L}\big(\sum_k Y_{nk}\big)$ where the summands $Y_{nk}$ are conditionally i.d. with ch.f.'s $g'_{nk}$.

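Proof. On account of (i), condition (ii) is equivalent to

(iii) $\sum_k E\int_{|x| < \varepsilon} (x - \alpha'_{nk}(\varepsilon))^2\,dF'_{nk}(x) \le c''' < \infty$,

since

$$\sum_k E\Big|\int_{|x| < \varepsilon} (x - \alpha'_{nk}(\varepsilon))^2\,dF'_{nk}(x) - \sigma'^2_{nk}(\varepsilon)\Big| = \sum_k E\,\alpha'^2_{nk}(\varepsilon)\int_{|x| \ge \varepsilon} dF'_{nk}(x) \le \varepsilon^2c'.$$

Therefore, upon substituting on $(-\varepsilon, +\varepsilon)$ the limited expansion of order 2 of the integrand in $\gamma'_{nk}(u)$,

$$\sum_k E|\gamma'_{nk}(u)| \le (2 + \varepsilon|u|)\sum_k \int_{|x| \ge \varepsilon} dF_{nk} + \frac{u^2}{2}\sum_k E\int_{|x| < \varepsilon} (x - \alpha'_{nk}(\varepsilon))^2\,dF'_{nk} \le (2 + \varepsilon|u|)c' + \frac{u^2}{2}c'''$$

so that, by c,

$$\sum_k E|\gamma'_{nk}(u)|^2 \le \max_k |\gamma'_{nk}(u)|\sum_k E|\gamma'_{nk}(u)| \to 0.$$

But the left-hand side sum is a number and hence converges to 0. Thus, by d,

$$\sum_k E|f'_{nk}(u) - g'_{nk}(u)| \to 0,$$

the comparison theorem in terms of ch.f.'s applies and, hence, $\mathcal{L}\big(\sum_k X_{nk}\big) \sim \mathcal{L}\big(\sum_k Y_{nk}\big)$, where the summands $Y_{nk}$ are conditionally i.d. with ch.f. $g'_{nk}$ and mutually independent. The theorem follows.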

Remark. In the case of independence it can be shown that, under uan condition, (i) and (ii) hold when the sequences $\mathcal{L}(X_n)$ or $\mathcal{L}(Y_n)$ are completely compact so that, then, $\mathcal{L}(X_n) \sim \mathcal{L}(Y_n)$. This extends the Central Limit theorem. The proof is left to the reader.

Random vectors. The extension to random vectors $X_{nk}$ can be obtained as usual either by reinterpreting the symbols used or by making correspond to the random vectors $X_{nk}$ the r.v.'s scalar products of $X_{nk}$ and of an undetermined sure vector $v$.

Random number of r.v.'s. Let the number $\nu_n$ of summands in the $n$th sum $\sum_{k=1}^{\nu_n} X_{nk}$ be a r.v. Set

$$p_n(r) = P[\nu_n = r], \qquad P_n(r) = \sum_{s \ge r} p_n(s)$$

and denote by $E_r$ the conditional expectation, given $\nu_n = r$. Assume that the expressions below exist and are finite (they certainly do if $E\nu_n < \infty$). Then

$$f_n(u) - g_n(u) = E\Big(\exp\Big[iu\sum_k X_{nk}\Big] - \exp\Big[iu\sum_k Y_{nk}\Big]\Big) = \sum_{r=1}^{\infty} p_n(r)\,E_r\Big(\exp\Big[iu\sum_{k=1}^{r} X_{nk}\Big] - \exp\Big[iu\sum_{k=1}^{r} Y_{nk}\Big]\Big).$$

But, when all the expectations are conditioned by $\nu_n = r$, then the comparison theorem applies. Hence,

$$\Big|E_r\Big(\exp\Big[iu\sum_{k=1}^{r} X_{nk}\Big] - \exp\Big[iu\sum_{k=1}^{r} Y_{nk}\Big]\Big)\Big| \le \sum_{k=1}^{r} E_r|f'_{nk}(u) - g'_{nk}(u)|$$

and

$$|f_n - g_n| \le \sum_{r=1}^{\infty} p_n(r)\Big\{\sum_{k=1}^{r} E_r|f'_{nk} - g'_{nk}|\Big\} = \sum_{k=1}^{\infty}\sum_{r \ge k} p_n(r)\,E_r|f'_{nk} - g'_{nk}|.$$

Write $E_{nk}$ for the operator $\sum_{r \ge k} p_n(r)E_r$. The relation becomes

$$|f_n - g_n| \le \sum_{k=1}^{\infty} E_{nk}|f'_{nk} - g'_{nk}|$$

and, hence,

C. When the number of summands is random, the results obtained by using the comparison theorem remain valid provided $\sum_k E$ is replaced by $\sum_r p_n(r)\sum_{k=1}^{r} E_r$ or by $\sum_k E_{nk}$.

If $\nu_n$ is independent of the $X_{nk}$ and $Y_{nk}$, then $E_r = E$ and hence $E_{nk} = P_n(k)E$, and it suffices to multiply $F'_{nk}$ and $G'_{nk}$ by $P_n(k)$. If, moreover, $\nu_n$ degenerates at $k_n$, then $P_n(k) = 1$ or 0, according as $k \le k_n$ or $k > k_n$, and, as is to be expected, we fall back on sure number $k_n$ of summands.

§32. CENTERINGS, MARTINGALES, AND A.S. CONVERGENCE

32.1. Centerings. Conditions for a.s. convergence (and a.s. stability) of sums of independent r.v.'s were obtained by means of centerings at expectations or at medians. The methods continue to apply to the general case, provided the centering quantities are conditioned and, thus, become themselves r.v.'s. Furthermore, as has to be expected, the conditions so obtained will be sufficient but no more necessary. Since the proofs run parallel to those in the case of independence, we shall be content with essentials and shall leave the complete transcription of Chapter V to the reader.

Centering at conditional medians. We say that a r.v. $\mu^{\mathcal{B}}X$ is a conditional median of $X$ given $\mathcal{B}$, where $\mathcal{B}$ is a sub $\sigma$-field of events, if

$$P^{\mathcal{B}}[X - \mu^{\mathcal{B}}X \ge 0] \ge \tfrac12 \le P^{\mathcal{B}}[X - \mu^{\mathcal{B}}X \le 0] \quad\text{a.s.}$$

When independence is not assumed, the proof of inequality 17.1C breaks down at the point where $PA_kB_k$ is replaced by $PA_kPB_k$. Yet, if we observe that $PA_kB_k = E\{I_{A_k}P(B_k \mid S_1, \ldots, S_k)\}$ and replace medians $\mu(S_k - S_n)$ by conditional medians $\mu(S_k - S_n \mid S_1, \ldots, S_k)$, then the proof remains valid. Thus

A. Extended P. Lévy inequality. If the sums $S_k$ are centered at conditional medians $\mu(S_k - S_n \mid S_1, \ldots, S_k)$, then

$$P[\max_{k \le n} |S_k| \ge \varepsilon] \le 2P[|S_n| \ge \varepsilon].$$

The propositions in 17.2 which result from P. Lévy's inequality continue to hold with similar modifications. Let us state the most important one.

B. Convergence theorem. If the sequence of sums $S_n \xrightarrow{P} S$, then there exists a sequence $\xi_n$ of conditional medians of suitably selected partial sums such that $\xi_n \xrightarrow{P} 0$ and $S_n - \xi_n \xrightarrow{\text{a.s.}} S$.

Remark. Propositions much more similar to those of the case of independence are obtainable by means of centerings at conditional expectations and, as we shall see in the next subsection, such centerings provide an important dependence model, that of "martingales," which is a "natural" generalization of that of consecutive sums of independent r.v.'s centered at expectations. Yet the power of the centerings at medians accompanying symmetrizations in the case of independence leads one to think that it would be of interest to investigate in detail the dependence model that such centerings provide.




Centering at conditional expectations. We suppose that the r.v.'s $X_n$ are integrable so that c.exp.'s $\xi_n = E(X_n \mid X_1, \ldots, X_{n-1})$ exist and are finite; for $n = 1$ the conditioning disappears and $\xi_1 = EX_1$ a.s. We have $E\xi_n = EX_n$ and

$$E(X_n - \xi_n \mid X_1, \ldots, X_{n-1}) = \xi_n - \xi_n = 0 \quad\text{a.s.}$$

Therefore, for $m < n$,

$$E\{(X_m - \xi_m)(X_n - \xi_n) \mid X_1, \ldots, X_{n-1}\} = (X_m - \xi_m)E(X_n - \xi_n \mid X_1, \ldots, X_{n-1}) = 0 \quad\text{a.s.}$$

so that

$$E(X_m - \xi_m)(X_n - \xi_n) = 0$$

and, hence,

$$E\Big(\sum_{k=1}^{n} (X_k - \xi_k)\Big)^2 = \sum_{k=1}^{n} E(X_k - \xi_k)^2.$$

We say that the r.v.'s $X_n$ of a sequence are centered at c.exp.'s given the predecessors, if $\xi_n = 0$ a.s. (Such centerings were first systematically used by P. Lévy.) Thus

a. Extended Bienaymé equality. If the r.v.'s $X_n$ of a sequence are centered at c.exp.'s given the predecessors, then they are centered at exp.'s and

$$\sigma^2S_n = \sum_{k=1}^{n} \sigma^2X_k.$$

In fact, more is true. If $\xi_n = 0$ a.s. and $A_{n-1} \in \mathcal{B}(X_1, \ldots, X_{n-1})$ is an event defined in terms of $X_1, \ldots, X_{n-1}$, then

$$E(S_{n-1}I_{A_{n-1}}X_n \mid X_1, \ldots, X_{n-1}) = S_{n-1}I_{A_{n-1}}E(X_n \mid X_1, \ldots, X_{n-1}) = 0 \quad\text{a.s.}$$

and, hence,

$$E(S_{n-1}I_{A_{n-1}}X_n) = 0.$$

Because of this orthogonality property, the proof of the right-hand side of Kolmogorov's inequality remains valid word for word. Thus

C. Extended Kolmogorov inequality. If the r.v.'s $X_k$, $k = 1, \ldots, n$, are centered at c.exp.'s given the predecessors, then

$$P[\max_{k \le n} |S_k| \ge \varepsilon] \le \frac{1}{\varepsilon^2}\sum_{k=1}^{n} \sigma^2X_k.$$
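The extended Kolmogorov inequality can be verified exactly on a small dependent example. The sketch below is an illustration under stated assumptions: the model $X_k = \epsilon_k(1 + |S_{k-1}|)$ with independent fair signs $\epsilon_k$ is hypothetical (it is not from the text), chosen because its summands are dependent yet centered at c.exp.'s given the predecessors. Both sides of the inequality are computed by enumerating all sign paths with exact rational arithmetic.

```python
from fractions import Fraction
from itertools import product


def paths(n):
    """Yield (probability, [X_1, ..., X_n]) for the hypothetical model
    X_k = eps_k * (1 + |S_{k-1}|), eps_k independent fair signs, S_0 = 0."""
    for signs in product((-1, 1), repeat=n):
        xs, s = [], 0
        for eps in signs:
            x = eps * (1 + abs(s))
            xs.append(x)
            s += x
        yield Fraction(1, 2 ** n), xs


def kolmogorov_sides(n, threshold):
    """Return (P[max_k |S_k| >= threshold], sum_k sigma^2 X_k / threshold^2)."""
    lhs = Fraction(0)
    var_sum = Fraction(0)
    for prob, xs in paths(n):
        partial, running_max = 0, 0
        for x in xs:
            partial += x
            running_max = max(running_max, abs(partial))
        if running_max >= threshold:
            lhs += prob
        # E X_k = 0 in this model, so E X_k^2 is the variance of X_k.
        var_sum += prob * sum(x * x for x in xs)
    return lhs, var_sum / (threshold * threshold)
```

For instance, `kolmogorov_sides(2, 3)` compares the exact probability that the partial sums reach 3 in absolute value within two steps with the variance bound; the first component never exceeds the second, whatever the threshold.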
The propositions in 16.3 which result from Kolmogorov's inequality and those in 16.4 hold with similar modifications. Let us state the most important ones.

D. Convergence theorem. If the series $\sum \sigma^2X_n$ converges and the series $\sum \xi_n$ converges a.s., then the series $\sum X_n$ converges a.s.

More generally, if for some positive constant $c$ the series $\sum P[|X_n| \ge c]$ and $\sum E\{X_n^c - E(X_n^c \mid X_1, \ldots, X_{n-1})\}^2$ converge and the series $\sum E(X_n^c \mid X_1, \ldots, X_{n-1})$ converges a.s., then the series $\sum X_n$ converges a.s.


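E. Stability theorem. If $\sum \dfrac{\sigma^2X_n}{b_n^2} < \infty$, $b_n \uparrow \infty$, then

$$\frac{1}{b_n}\sum_{k=1}^{n} \{X_k - E(X_k \mid X_1, \ldots, X_{k-1})\} \xrightarrow{\text{a.s.}} 0.$$

Let $X$ be a r.v. and let $x$ vary on $[0, +\infty)$. If $E|X|^r < \infty$, $r < 2$, and $P[|X_n| \ge x] \le P[|X| \ge x]$ or $P\{|X_n| \ge x \mid X_1, \ldots, X_{n-1}\} \le P[|X| \ge x]$ a.s., according as $r \ne 1$ or $r = 1$, then

$$\frac{1}{n^{1/r}}\sum_{k=1}^{n} (X_k - \eta_k) \xrightarrow{\text{a.s.}} 0$$

with $\eta_k = 0$ or $E(X_k \mid X_1, \ldots, X_{k-1})$ according as $0 < r < 1$ or $1 \le r < 2$.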


Let the r.v.'s $X_n$ of a sequence be centered at c.exp.'s given the predecessors. Since $S_1 = X_1, \ldots, S_n = X_1 + \cdots + X_n$ determine and are determined by $X_1, \ldots, X_n$, it follows that

$$E(S_n \mid S_1, \ldots, S_{n-1}) = E(S_{n-1} + X_n \mid S_1, \ldots, S_{n-1}) = S_{n-1} + E(X_n \mid X_1, \ldots, X_{n-1}) = S_{n-1} \quad\text{a.s.}$$

This property of the sequence $S_n$ is called a "martingale" property. Conversely, if a sequence $S_n$ has the martingale property, then setting $X_n = S_n - S_{n-1}$ ($S_0 = 0$), we have

$$E(X_n \mid X_1, \ldots, X_{n-1}) = E(S_n - S_{n-1} \mid S_1, \ldots, S_{n-1}) = S_{n-1} - S_{n-1} = 0 \quad\text{a.s.}$$

Thus, the martingale property characterizes consecutive sums of r.v.'s centered at c.exp.'s given the predecessors. Since we are interested in a.s. properties of such sums, it is "natural" to investigate them directly without writing them as sums.
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32.2. Martingales : generalities. A possible interpretation of a ‘‘fair 
game” is as follows: Let X t represent: the debt or fortune of a gambler 
at time s. The game is fair if the gambler’s expected fortune at time /, 
given the past up to the time s < equals his fortune at time s. To 
this interpretation corresponds the concept of martingale. It has been 
introduced and investigated in the form of consecutive sums by P, Levy, 
then studied by Ville and systematically explored by Doob 一 to whom 
most of the results are due — and, finally, extended to ^advantageous 
games” or submartingales by Doob and, in a different formulation, 
by Andersen and Jessen. 

In this section we assume that, unless otherwise stated, the expectations
of the r.v.'s under consideration exist, and denote by

$$\mathcal{B}_n = \mathcal{B}(X_1, \dots, X_n), \quad \mathcal{C}_n = \mathcal{B}(X_n, X_{n+1}, \dots), \quad \mathcal{B} = \mathcal{C}_1 = \mathcal{B}(X_1, X_2, \dots), \quad \mathcal{C} = \bigcap \mathcal{C}_n$$

the sub $\sigma$-fields of events induced by the families of r.v.'s $(X_1, \dots, X_n)$,
$(X_n, X_{n+1}, \dots)$, $(X_1, X_2, \dots)$ and the tail of the sequence $\{X_n\}$,
respectively.

Definitions. Let $\{X_t, t \in T\}$ be a family of r.v.'s on a set $T$
ordered by the relation "$\prec$," and let $\mathcal{B}_t = \mathcal{B}\{X_{t'}, t' \prec t\}$ be the sub
$\sigma$-field of events induced by the subfamily of all the $X_{t'}$ with $t' \prec t$.
The family is said to be a martingale if, for every pair $s \prec t$,

$$X_s = E^{\mathcal{B}_s} X_t \ \text{a.s., equivalently,} \ \int_{B_s} X_s = \int_{B_s} X_t, \quad B_s \in \mathcal{B}_s.$$

The martingale is said to be closed on the left or on the right according
as it has a first or a last member (it may have neither or both).

If in the foregoing definitions "$=$" is replaced by "$\le$," the family is
said to be a submartingale. If the inequality sign is reversed, it is a
supermartingale. Changing the $X_t$ into $-X_t$ interchanges "sub" and
"super."

Note that the $X_t$ being r.v.'s, the above c.exp.'s are a.s. finite for
martingales while their negative parts are a.s. finite for submartingales.

We intend to investigate submartingales $\{X_n, n = 1, 2, \dots\}$. The
subscripts are ordered either by the relation "$\le$," and then we have a
submartingale sequence $X_1, X_2, \dots$ (closed on the left by $X_1$), or by
the relation "$\ge$," and then we have a submartingale reversed sequence
$\dots, X_2, X_1$ (closed on the right by $X_1$). Because of the basic smoothing
property of c.exp.'s the foregoing definitions reduce as follows. The
r.v.'s $X_n$, $n = 1, 2, \dots$, form

a martingale sequence, if $X_n = E(X_{n+1} \mid X_1, \dots, X_n)$ a.s.,

a closed (by $X$) martingale sequence, if $X_n = E(X_{n+1} \mid X_1, \dots, X_n)$,
$X_n = E(X \mid X_1, \dots, X_n)$ a.s.,

a martingale reversed sequence, if $X_{n+1} = E(X_n \mid X_{n+1}, X_{n+2}, \dots)$ a.s.,

a closed (by $X$) martingale reversed sequence, if $X_{n+1} = E(X_n \mid X_{n+1}, X_{n+2}, \dots, X)$ a.s.,
$X = E(X_n \mid X)$ a.s.

For example, in the first case, $\mathcal{B}_m \subset \mathcal{B}_{m+1} \subset \cdots \subset \mathcal{B}_{n-1}$ for $m < n$
and, by the basic smoothing property, we have

$$E^{\mathcal{B}_m} X_n = E^{\mathcal{B}_m} E^{\mathcal{B}_{n-1}} X_n = E^{\mathcal{B}_m} X_{n-1} = \cdots = X_m \quad \text{a.s.;}$$

similarly in the other cases.

If, in the foregoing relations, "$=$" is replaced by "$\le$," martingales
become submartingales.

Examples

1° Let $X_n = \sum_{k=1}^{n} Y_k$, $n = 1, 2, \dots$. If the r.v.'s $Y_k$ are independent
with $EY_k = 0$, or dependent with $E(Y_k \mid Y_1, \dots, Y_{k-1}) = 0$ a.s. for
$k > 1$, then, according to 29.1, the $X_n$ form a martingale sequence.

2° Let $\mathcal{A}_1, \mathcal{A}_2, \dots$ be a sequence of sub $\sigma$-fields of events and let
every $X_n$ be $\mathcal{A}_n$-measurable.

If $\mathcal{A}_1 \subset \mathcal{A}_2 \subset \cdots$ and every $X_n = E^{\mathcal{A}_n} X_{n+1}$ a.s., then the $X_n$ form
a martingale sequence, since $\mathcal{B}_n \subset \mathcal{A}_n$ and, by the smoothing property,

$$E^{\mathcal{B}_n} X_{n+1} = E^{\mathcal{B}_n} E^{\mathcal{A}_n} X_{n+1} = X_n \quad \text{a.s.}$$

Similarly, if $\mathcal{A}_1 \subset \mathcal{A}_2 \subset \cdots$ and every $X_n = E^{\mathcal{A}_n} X$, then the $X_n$
form a martingale sequence closed on the right by $X$. For example,
for any r.v. $X$ and random sequence $Y_n$, the sequence $E(X \mid Y_1, \dots, Y_n)$
is such a martingale.

Similarly, if $\mathcal{A}_1 \supset \mathcal{A}_2 \supset \cdots$ and every $X_n = E^{\mathcal{A}_n} X$ a.s., then the $X_n$
form a martingale reversed sequence. For example, for any r.v. $X$ and
random sequence $Y_n$, the reversed sequence $\dots, E(X \mid Y_n, Y_{n+1}, \dots),$
$\dots, E(X \mid Y_2, Y_3, \dots), E(X \mid Y_1, Y_2, \dots)$ is a martingale.
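
Examples 1° and 2° can be checked exactly on a small finite probability space. The sketch below is only an illustration under assumptions that are not in the text: the $Y_k$ are taken to be $N = 4$ fair $\pm 1$ coin flips, the closing r.v. is the arbitrary choice $X = (Y_1 + \cdots + Y_N)^2$, and the helper names (`cond_exp`, `partial_sum`, `doob`) are hypothetical. Conditional expectations are computed by direct enumeration with exact rational arithmetic.

```python
from fractions import Fraction
from itertools import product

def cond_exp(f, n, N):
    """E(f(Y_1..Y_N) | Y_1..Y_n) for N fair +/-1 coin flips,
    returned as a dict keyed by the first n outcomes."""
    out = {}
    for head in product((-1, 1), repeat=n):
        tails = list(product((-1, 1), repeat=N - n))
        out[head] = sum(Fraction(f(head + t)) for t in tails) / len(tails)
    return out

N = 4  # assumed horizon, small enough for exhaustive enumeration

# Example 1: X_n = Y_1 + ... + Y_n with independent centered summands.
def partial_sum(n):
    return lambda y: sum(y[:n])

sums_ok = all(
    cond_exp(partial_sum(n + 1), n, N)[head] == sum(head)
    for n in range(1, N)
    for head in product((-1, 1), repeat=n)
)

# Example 2: X_n = E(X | Y_1..Y_n) for a fixed r.v. X (an arbitrary choice).
def X(y):
    return sum(y) ** 2

def doob(n):
    table = cond_exp(X, n, N)
    return lambda y: table[y[:n]]

doob_ok = all(
    cond_exp(doob(n + 1), n, N)[head] == cond_exp(X, n, N)[head]
    for n in range(1, N)
    for head in product((-1, 1), repeat=n)
)

print(sums_ok, doob_ok)
```

Both flags compare $E(X_{n+1} \mid Y_1, \dots, Y_n)$ with $X_n$ on every atom of the conditioning $\sigma$-field, which is the martingale relation of the text written out on a finite space.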

Decomposition of submartingales. To simplify, we assume that the
r.v.'s below are integrable, and leave to the reader the discussion of the
case when their expectations exist but are not necessarily finite.

1° Let $X_1, X_2, \dots$ be a sequence of r.v.'s and set $X''_1 = 0$,

$$X_n = X'_n + X''_n, \quad X''_n = \sum_{k=2}^{n} \{E(X_k \mid X_1, \dots, X_{k-1}) - X_{k-1}\}.$$




56 


FROM INDEPENDENCE TO DEPENDENCE 


【 Sec. 32] 


It follows that

$$E(X''_{n+1} \mid X_1, \dots, X_n) = X''_{n+1}$$

and hence

$$E(X'_{n+1} \mid X_1, \dots, X_n) = X'_n \quad \text{a.s.}$$

Thus the sequence $X'_1, X'_2, \dots$ is a martingale. In particular, if the
sequence $X_1, X_2, \dots$ is a submartingale, then every summand in $X''_n$
is a.s. nonnegative. Therefore, a submartingale sequence $X_n$ is decomposable
into a martingale sequence $X'_n$ and an a.s. nonnegative and nondecreasing
sequence $X''_n$; more precisely,

$$X_n = X'_n + X''_n, \quad E(X'_{n+1} \mid X_1, \dots, X_n) = X'_n \ \text{a.s.}, \quad 0 \le X''_n \uparrow \ \text{a.s.}$$

and, hence,

$$EX_n = EX'_n + EX''_n, \quad E|X'_n| \le E|X_n| + EX''_n, \quad 0 \le EX''_n \uparrow.$$

Let $\sup E|X_n| < \infty$. Then, it follows that $X'_n$ and $X''_n$ are integrable
and $\sup E|X'_n| < \infty$, $\sup EX''_n < \infty$. Thus $0 \le X''_n \uparrow X''$ a.s. finite,
and the study of the convergence of the submartingale sequence reduces
to that of the martingale sequence $X'_n$ with $\sup E|X'_n| < \infty$. Moreover,
the limits, if any, differ by an integrable r.v.
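
The decomposition in 1° can be computed explicitly in a toy case. The sketch below assumes (these choices are not in the text) that the submartingale is $X_n = (Y_1 + \cdots + Y_n)^2$ for $N = 5$ fair $\pm 1$ coin flips $Y_k$; the function names are hypothetical. Conditional expectations given $X_1, \dots, X_{k-1}$ are obtained by grouping the equally likely outcomes according to the observed past.

```python
from collections import defaultdict
from fractions import Fraction
from functools import lru_cache
from itertools import product

N = 5                                      # assumed horizon
paths = list(product((-1, 1), repeat=N))   # equally likely outcomes

def X(w, n):
    """Submartingale X_n = (Y_1 + ... + Y_n)^2, n = 1..N (illustrative choice)."""
    return sum(w[:n]) ** 2

@lru_cache(maxsize=None)
def cond_exp_given_past(k):
    """E(X_k | X_1, ..., X_{k-1}) as a dict keyed by the observed past."""
    groups = defaultdict(list)
    for w in paths:
        past = tuple(X(w, j) for j in range(1, k))
        groups[past].append(Fraction(X(w, k)))
    return {past: sum(v) / len(v) for past, v in groups.items()}

def decompose(w):
    """Return the lists (X'_n) and (X''_n), n = 1..N, along the outcome w."""
    x2 = [Fraction(0)]                     # X''_1 = 0
    for k in range(2, N + 1):
        past = tuple(X(w, j) for j in range(1, k))
        x2.append(x2[-1] + cond_exp_given_past(k)[past] - X(w, k - 1))
    x1 = [X(w, n) - x2[n - 1] for n in range(1, N + 1)]
    return x1, x2

x1, x2 = decompose((1, 1, -1, 1, 1))
print(x1, x2)
```

In this example every summand of $X''_n$ equals 1, so $X''_n = n - 1$ is nonnegative and nondecreasing and $X'_n = X_n - (n - 1)$ is the martingale part.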

2° Similarly, let the reversed sequence $\dots, X_2, X_1$ be a submartingale
with $EX_1$ finite and set

$$X_n = X'_n + X''_n, \quad X''_n = \sum_{k=n}^{\infty} \{E(X_k \mid X_{k+1}, X_{k+2}, \dots) - X_{k+1}\}.$$

The summands of the infinite sum are a.s. nonnegative, so that a.s.
$0 \le X''_n \downarrow$, with $EX''_1 = EX_1 - \lim EX_n$ (the limit exists since,
clearly, $EX_1 \ge EX_2 \ge \cdots$). Let $\lim EX_n > -\infty$ so that $EX''_1$ is
finite. Then $X''_1$ is a.s. finite, $0 \le X''_n \downarrow 0$ a.s., and the study of the
convergence of the submartingale reversed sequence reduces to that of
the martingale reversed sequence $\dots, X'_2, X'_1$ with $E|X'_1| < \infty$; moreover,
the limit, if any, is the same.

The interpretation of a martingale as a sequence of fortunes of a
gambler raises the question whether in the long run ($n \to \infty$) his fortune
was or becomes stabilized, that is, whether there is convergence
in some sense. To answer the question we require a few inequalities.

a. Let $g$ be convex and continuous on $R$ with $g(+\infty) = +\infty$. If $EX$
exists and $E^{\mathcal{B}}X > -\infty$ a.s., then $g(E^{\mathcal{B}}X) \le E^{\mathcal{B}}g(X)$ a.s.

For, if $E^{\mathcal{B}}X < \infty$ a.s. the conditional convexity inequality applies; otherwise
apply it to $X_n = X I_{[X < n]} + n I_{[X \ge n]}$ and let $n \to \infty$.

A. Submartingale inequalities. Let the r.v.'s $X_j$ form a countable
submartingale. Then

(i) the r.v.'s $X_j^+$ form a submartingale; and if the $X_j \ge 0$ a.s. or the
$X_j$ form a martingale, then, for every $r \ge 1$, the $|X_j|^r$ form a submartingale.

(ii) if the r.v. $Y$ closes on the right the submartingale, then, for every $c > 0$,

$$cP[\sup X_j > c] \le \int_{[\sup X_j > c]} Y;$$

and if the $X_j \ge 0$ a.s. or the $X_j$ form a martingale then, for every $r \ge 1$,

$$c^r P[\sup |X_j| > c] \le \int_{[\sup |X_j| > c]} |Y|^r.$$

Proof. (i) follows from a by taking, respectively, $g(x) = x^+$, or
$g(x) = 0$ for $x < 0$ and $g(x) = x^r$ for $x \ge 0$, or $g(x) = |x|^r$.

To prove (ii), set $A_j = [X_j > c, \text{the predecessors} \le c]$, so that $B =
[\sup X_j > c] = \sum A_j$ and, since $Y$ closes the submartingale,

$$\int_B Y = \sum \int_{A_j} Y = \sum \int_{A_j} E(Y \mid X_j \text{ and the predecessors}) \ge \sum \int_{A_j} X_j \ge c \sum PA_j = cPB,$$

so that the first inequality is proved and the second follows on account
of (i).
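
As a sanity check of the second inequality in A(ii), the following sketch evaluates both sides exactly for a finite martingale. The setting is an assumption made for illustration only: $S_1, \dots, S_n$ is the simple symmetric random walk, closed on the right by $Y = S_n$, and the function name is hypothetical.

```python
from fractions import Fraction
from itertools import product

def maximal_inequality_sides(n, c, r):
    """Return (c^r P[max_k |S_k| > c], integral of |S_n|^r over that event)
    for the simple symmetric random walk S_k, k = 1..n (exact arithmetic)."""
    prob = Fraction(1, 2 ** n)        # each +/-1 path is equally likely
    lhs = Fraction(0)
    rhs = Fraction(0)
    for w in product((-1, 1), repeat=n):
        partial = 0
        running_max = 0
        for step in w:
            partial += step
            running_max = max(running_max, abs(partial))
        if running_max > c:           # the event [sup |S_k| > c]
            lhs += prob * c ** r
            rhs += prob * abs(partial) ** r
    return lhs, rhs

lhs, rhs = maximal_inequality_sides(n=6, c=2, r=2)
print(lhs, rhs, lhs <= rhs)
```

The comparison `lhs <= rhs` is the inequality $c^r P[\sup |X_j| > c] \le \int_{[\sup |X_j| > c]} |Y|^r$ with the integral written as a finite sum over paths.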

32.3. Martingales: convergence and closure. The limit properties of
submartingales are summarized in the convergence theorem below.
The proof is based on an ingenious inequality due to Doob.

Let $x_k$, $k = 1, \dots, n$, be finite numbers. The number $h$ of crossings
from the left of the interval $[a, b]$ is the number of times that, starting
with $x_1$ and proceeding to $x_n$, we pass from the left of the interval to its
right. More precisely, let

$$x_{k_1} \le a, \quad x_{k_2} \ge b, \quad x_{k_3} \le a, \quad x_{k_4} \ge b, \quad \dots$$

where $k_1$ is the first subscript $k$, if any, such that $x_{k_1} \le a$, then $k_2$ is
the first subscript $k > k_1$, if any, such that $x_{k_2} \ge b$, and so on. If $k_{j_0}$
is the last subscript so obtained, set $k_j = n + 1$ for $j_0 < j \le n$; if
there is none, then $k_1 = \cdots = k_n = n + 1$. Thus, to every $k > k_1$, if
any, there corresponds an integer $j$ determined by the values of $x_1, \dots,$
$x_{k-1}$ and such that $k_j < k \le k_{j+1}$. For $k > 1$, if $k \le k_1$ set $i_k = 0$,
and if $k > k_1$ set $i_k = 0$ or 1 according as the corresponding $j$ is odd or
even. When $k_2 \le n$, the number of crossings is the largest integer $h$
such that $k_{2h} \le n$; when $k_2 > n$, the number of crossings $h = 0$.

Let $h > 0$. If $k_{2h+1} \le n$, then

$$\sum_{k=2}^{n} i_k (x_k - x_{k-1}) = (x_{k_3} - x_{k_2}) + \cdots + (x_{k_{2h+1}} - x_{k_{2h}}) \le (a - b)h.$$

If $k_{2h+1} > n$, then

$$\sum_{k=2}^{n} i_k (x_k - x_{k-1}) = (x_{k_3} - x_{k_2}) + \cdots + (x_{k_{2h-1}} - x_{k_{2h-2}}) + (x_n - x_{k_{2h}}) \le (a - b)h + (x_n - a).$$

Let $h = 0$. Then the left-hand sum is null, and the first inequality
is trivially true.

Thus, in either case

$$\sum_{k=2}^{n} i_k (x_k - x_{k-1}) \le (a - b)h + (x_n - a)^+.$$

Now, to r.v.'s $X_1, \dots, X_n$ we make correspond a r.v. $H_n$ and r.v.'s
$I_k$ ($k > 1$) determined by $X_1, \dots, X_{k-1}$. We define them by $H_n(\omega)
= h$, $I_k(\omega) = i_k$ for $x_1 = X_1(\omega), \dots, x_n = X_n(\omega)$, $\omega \in \Omega$, where $h$ and
$i_k$ are the numbers introduced above. The inequality established above
becomes

$$\sum_{k=2}^{n} I_k (X_k - X_{k-1}) \le (a - b)H_n + (X_n - a)^+$$

and, by taking expectations assumed finite, we have

$$\sum_{k=2}^{n} \int_{[I_k = 1]} (X_k - X_{k-1}) \le (a - b)EH_n + E(X_n - a)^+.$$

If $X_1, \dots, X_n$ is an integrable submartingale, then every left-hand
integral is nonnegative and, hence,

$$(b - a)EH_n \le E(X_n - a)^+ = \sup_{k \le n} E(X_k - a)^+.$$

If $E(X_n - a)^+ = \infty$, the inequality is trivially true. If $E(X_n - a)^+
< \infty$, note that $H_n$ is also the number of crossings of $[0, b - a]$ by the
integrable submartingale $(X_1 - a)^+, \dots, (X_n - a)^+$. It follows that the
inequality is always true.

Similarly, but proceeding from $x_n$ to $x_1$ instead of from $x_1$ to $x_n$, if
$X_n, \dots, X_1$ is a submartingale, then

$$(b - a)EH_n \le E(X_1 - a)^+ = \sup_{k \le n} E(X_k - a)^+.$$

To summarize (see also 36.2)

a. If $X_1, \dots, X_n$ or $X_n, \dots, X_1$ is a submartingale, then

$$(b - a)EH_n \le \sup_{k \le n} E(X_k - a)^+.$$
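
The combinatorial part of the argument (the subscripts $k_j$, the indicators $i_k$, the count $h$, and the pathwise inequality that precedes the passage to expectations) can be written out directly. The sketch below follows the definitions given above for a finite numerical sequence; the function names are hypothetical and the example sequence is an arbitrary choice.

```python
def crossings(x, a, b):
    """For the finite sequence x_1..x_n (x is a 0-indexed list) return (h, i):
    h is the number of crossings of [a, b] from the left, and i[k], k = 2..n,
    is the indicator i_k of the text (stored in a dict)."""
    n = len(x)
    ks = []                # the subscripts k_1 < k_2 < ... that are <= n (1-based)
    want_low = True        # odd-numbered k_j need x <= a, even-numbered need x >= b
    for k in range(1, n + 1):
        if (want_low and x[k - 1] <= a) or (not want_low and x[k - 1] >= b):
            ks.append(k)
            want_low = not want_low
    h = len(ks) // 2       # largest h with k_{2h} <= n
    i = {}
    for k in range(2, n + 1):
        j = sum(1 for kj in ks if kj < k)      # the j with k_j < k <= k_{j+1}
        i[k] = 1 if (j > 0 and j % 2 == 0) else 0
    return h, i

def pathwise_sides(x, a, b):
    """Left and right sides of sum_k i_k (x_k - x_{k-1}) <= (a - b) h + (x_n - a)^+."""
    h, i = crossings(x, a, b)
    lhs = sum(i[k] * (x[k - 1] - x[k - 2]) for k in range(2, len(x) + 1))
    rhs = (a - b) * h + max(x[-1] - a, 0)
    return lhs, rhs

print(crossings([0, 3, 0, 3, 1], 0, 2)[0], pathwise_sides([0, 3, 0, 3, 1], 0, 2))
```

For the sequence $0, 3, 0, 3, 1$ and the interval $[0, 2]$ there are two crossings, and the two sides of the pathwise inequality are $-5$ and $-3$.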

We are now in a position to prove the basic

A. Submartingales convergence theorem. Let the r.v.'s $X_n$ form
a submartingale sequence or reversed sequence.

(i) If $\sup EX_n^+ < \infty$, then $X_n \xrightarrow{\text{a.s.}} X < \infty$ with $EX \le \sup EX_n^+$
and $E|X| \le \sup E|X_n|$.

(ii) $X_n \xrightarrow{r} X$ where $r \ge 1$ if, and only if, the $|X_n|^r$ are uniformly
integrable, and then $X_n \xrightarrow{\text{a.s.}} X$.

Proof. 1° Since

$$[X_n \nrightarrow] = \bigcup_{a,b} A_{a,b} \quad \text{with} \quad A_{a,b} = [\liminf X_n < a < b < \limsup X_n],$$

where $a$, $b$ vary over the denumerable set of all rationals, the divergence
set is null if, and only if, every set $A_{a,b}$ is null.

We apply the foregoing lemma to an arbitrary set $A_{a,b}$. Since $H_n \uparrow H
= \infty$ on $A_{a,b}$, this set is null, provided $P[H = \infty] = 0$; it will be so,
whether the submartingale is a sequence or a reversed sequence, provided

$$EH = \sup EH_n \le \sup E(X_n - a)^+/(b - a) < \infty.$$

Therefore, $\sup EX_n^+ < \infty$ and hence $\sup E(X_n - a)^+ < \infty$, for every
$a \in R$, imply that $X_n \xrightarrow{\text{a.s.}}$ some $X$ finite or not, and, by the Fatou-
Lebesgue theorem, $E|X| \le \sup E|X_n|$.

It follows that $X_n^+ \xrightarrow{\text{a.s.}} X^+$ and, by the same theorem, $EX^+ \le
\sup EX_n^+ < \infty$. Thus, $X^+$ is integrable hence a.s. finite. Therefore,
upon modifying if necessary $X^+$ hence $X$ on a null set, we can take $X^+$
to be finite so that $X \le X^+ < \infty$. Also, $EX$ exists and

$$EX \le EX^+ \le \sup EX_n^+.$$

The first assertion is proved.

2° If $X_n \xrightarrow{r} X$ for some $r \ge 1$, then, by the $L_r$-convergence theorem,
the $|X_n|^r$ are uniformly integrable. Conversely, let the $|X_n|^r$
be uniformly integrable for some $r \ge 1$. Then the $E|X_n|^r$, a fortiori
the $E|X_n| \le E^{1/r}|X_n|^r$, are uniformly bounded. Therefore, by (i),
$X_n \xrightarrow{\text{a.s.}} X$ and, by the $L_r$-convergence theorem, $X_n \xrightarrow{r} X$. The second
assertion is proved.

The foregoing convergence theorem yields

B. Submartingales closure theorem. Let $r \ge 1$.

(i) Let $\{X_n\}$ be a martingale or a nonnegative submartingale sequence or
reversed sequence. If $Y \in L_r$ closes it on the right, then $X_n \xrightarrow{r} X$. If
$\sup E|X_n|^r < \infty$ with $r > 1$, then such a $Y$ exists.

(ii) Let $\{X_n\}$ be a (sub)martingale sequence or reversed sequence. If
$X_n \xrightarrow{r} X$, then $X_n \xrightarrow{\text{a.s.}} X$ and $X$ closes on the right, respectively, on the
left the (sub)martingale; in fact, $X$ is the nearest of the closing r.v.'s.

Proof. 1° Let $Y \in L_r$ close $\{X_n\}$. Set $B_n = [|X_n| > c]$ so that
$B = [\sup |X_n| > c] = \bigcup B_n$, and use 29.2A. Since $c^r PB \le \int_B |Y|^r
\le E|Y|^r < \infty$ and $\int_{B_n} |X_n|^r \le \int_{B_n} |Y|^r \le \int_B |Y|^r$, it follows that, as $c \to \infty$,
$PB \to 0$, hence $\int_B |Y|^r \to 0$, and the $|X_n|^r$ are uniformly integrable.
Thus A applies, and $X_n \xrightarrow{r} X$.

If $\sup E|X_n|^r < \infty$ with $r > 1$, then, by 9.4C, Cor. 2, and by A,
$X_n \xrightarrow{1} X$. Since $E|X|^r \le \sup E|X_n|^r$ and, by (ii), $X$ closes $\{X_n\}$,
(i) is proved, provided we prove (ii).



2° Let the assumptions of (ii) hold. Then $X_n \xrightarrow{r} X$ implies that
$X_n \xrightarrow{1} X$ and also, by A, that $X_n \xrightarrow{\text{a.s.}} X$. Thus we can pass to the
limit under the integration sign, as follows:

In the submartingale sequence case, we have, for every $B_n \in \mathcal{B}_n$,

$$\int_{B_n} X_n \le \int_{B_n} X_{n+m}$$

and, by letting $m \to \infty$, we obtain

$$\int_{B_n} X_n \le \int_{B_n} X.$$

Therefore, $X_n \le E^{\mathcal{B}_n} X$ a.s., that is, the submartingale sequence is
closed on the right by $X$. If a r.v. $Y$ also closes the sequence on the
right, that is, for every $B_n \in \mathcal{B}_n$,

$$\int_{B_n} X_{n+m} \le \int_{B_n} Y,$$

then, by letting $m \to \infty$, we obtain

$$\int_{B_n} X \le \int_{B_n} Y.$$

Therefore, on every $\mathcal{B}_n$ and hence on $\bigcup \mathcal{B}_n$, the indefinite integral of
$Y - X$ (it exists since $X$ is integrable) is a $\sigma$-finite measure and, by the
extension theorem, determines a $\sigma$-finite measure on $\mathcal{B}$. Thus, the
indefinite integral of $Y - X$ on $\mathcal{B}$ is nonnegative, that is, for every
$B \in \mathcal{B}$,

$$\int_B X \le \int_B Y = \int_B E^{\mathcal{B}} Y.$$

Since $X$ is equivalent to a $\mathcal{B}$-measurable function, it follows that $X \le E^{\mathcal{B}} Y$
a.s. and hence the submartingale $X_1, X_2, \dots, X$ is closed on the
right by $Y$; that is, $X$ is the "nearest" of the closing r.v.'s.

Similarly, in the case of a submartingale reversed sequence, for every
$C \in \mathcal{B}(X)$ ($C \in \mathcal{C}$ since $X$ is $\mathcal{C}$-measurable), as $m \to \infty$,

$$\int_C X \leftarrow \int_C X_{n+m} \le \int_C X_n = \int_C E(X_n \mid X),$$

so that $X$ is a closing r.v. on the left and if $Y$ is another closing r.v., then
for $C \in \mathcal{B}(Y)$

$$\int_C Y \le \int_C X_n \to \int_C X = \int_C E(X \mid Y),$$

so that $Y$ closes on the left the submartingale $X, \dots, X_2, X_1$. Finally,
for martingales all foregoing inequalities become equalities. The proof
is terminated.

Various cases. Let us put together the properties of the various
types of martingales and submartingales which are contained in 29.2A
and 29.3A and B. In what follows $r \ge 1$.

I Martingale sequence $X_1, X_2, \dots$

Inequalities:

$$EX_1 = EX_2 = \cdots; \quad EX_1^+ \le EX_2^+ \le \cdots; \quad E|X_1|^r \le E|X_2|^r \le \cdots.$$

Convergence. If $\lim EX_n^+ < \infty$ or $\lim EX_n^- < \infty$, then $X_n \xrightarrow{\text{a.s.}} X
< +\infty$ or $> -\infty$.

Closure. The martingale is closed on the right by a r.v. $Y \in L_r$ if,
and only if, the $|X_n|^r$ are uniformly integrable; then $X_n \xrightarrow{r} X$ and $X$
is the nearest of the closing r.v.'s. In particular, the martingale is closed
by a r.v. $\in L_r$ when $\lim E|X_n|^r < \infty$ with $r > 1$.

II Submartingale sequence $X_1, X_2, \dots$

Inequalities:

$$EX_1 \le EX_2 \le \cdots; \quad EX_1^+ \le EX_2^+ \le \cdots;$$

$$X_n \ge 0 \ \text{a.s.} \implies EX_1^r \le EX_2^r \le \cdots.$$

Convergence. If $\lim EX_n^+ < \infty$, then $X_n \xrightarrow{\text{a.s.}} X < \infty$. If
$\sup E|X_n| < \infty$,
in particular if either every $X_n \le 0$ a.s. or every $X_n \ge 0$ a.s. and $\lim EX_n
< \infty$, then $X_n \xrightarrow{\text{a.s.}} X$ finite.

Closure. If the $|X_n|^r$ are uniformly integrable, then $X_n \xrightarrow{r} X \in L_r$
and $X$ is the nearest of closing r.v.'s. If every $X_n \ge 0$ a.s., then the $X_n^r$
are uniformly integrable, if, and only if, there is a closing on the right r.v.
$Y \in L_r$, and there is one when $\lim EX_n^r < \infty$ with $r > 1$.

III Martingale reversed sequence $\dots, X_2, X_1$

Inequalities:

$$\cdots = EX_2 = EX_1; \quad \cdots \le EX_2^+ \le EX_1^+; \quad \cdots \le E|X_2|^r \le E|X_1|^r.$$

Convergence. If $EX_1^+ < \infty$ or $EX_1^- < \infty$, then $X_n \xrightarrow{\text{a.s.}} X < \infty$
or $> -\infty$, respectively.

Closure. If $E|X_1|^r < \infty$, then $X_n \xrightarrow{r} X \in L_r$ and $X$ is the nearest
of the closing r.v.'s.

IV Submartingale reversed sequence $\dots, X_2, X_1$

Inequalities:

$$\cdots \le EX_2 \le EX_1; \quad \cdots \le EX_2^+ \le EX_1^+;$$

$$\text{every } X_n \ge 0 \ \text{a.s.} \implies \cdots \le EX_2^r \le EX_1^r.$$

Convergence. If $EX_1^+ < \infty$, then $X_n \xrightarrow{\text{a.s.}} X < \infty$.

Closure. If the $|X_n|^r$ are uniformly integrable, then $X_n \xrightarrow{r} X \in L_r$
and $X$ is the nearest of the closing r.v.'s. In particular, $X_n \xrightarrow{1} X$ if and
only if $\sup E|X_n| < \infty$, equivalently, $E|X_1| < \infty$, $\lim EX_n > -\infty$
(see 36.1c).

Remark. By using the decomposition of submartingales given in
29.2, we can deduce their properties from those of martingales:

1° Let $X_1, X_2, \dots$ be a submartingale sequence with $\sup E|X_n|
< \infty$. Then $X_n = X'_n + X''_n$ where $X'_1, X'_2, \dots$ is a martingale sequence
with $\sup E|X'_n| < \infty$, and $0 \le X''_n \uparrow X''$ finite a.s. Therefore,
$X'_n \xrightarrow{\text{a.s.}} X'$ finite and $X_n \xrightarrow{\text{a.s.}} X = X' + X''$ finite.

2° Let $\dots, X_2, X_1$ be a submartingale reversed sequence with
$E|X_1| < \infty$ and $\lim EX_n > -\infty$. Then $X_n = X'_n + X''_n$ where $\dots,$
$X'_2, X'_1$ is a martingale reversed sequence with $E|X'_1| < \infty$ and $0 \le
X''_n \downarrow 0$ a.s., $EX''_1 < \infty$. Therefore $X'_n \xrightarrow{\text{a.s.}} X'$ and $X_n \xrightarrow{\text{a.s.}} X'$.

32.4. Applications. We use now the properties of martingales in order
to extend various properties obtained in the case of independence. In
general, we shall revert to P. Lévy's form of martingale sequences
$X_n = \sum_{k=1}^{n} Y_k$ with $E(Y_{k+1} \mid Y_1, \dots, Y_k) = 0$ a.s.; then $\mathcal{B}_n = \mathcal{B}(X_1, \dots,$
$X_n) = \mathcal{B}(Y_1, \dots, Y_n)$, and we set $\mathcal{B}_0 = \{\emptyset, \Omega\}$.

We shall have use for a truncation of subscripts, first introduced by
P. Lévy and which transforms martingales into martingales. Let $\nu$ be
an integer-valued measurable function, finite or not, and such that the
events $[\nu > n]$ are defined on the first $n$ terms of a sequence $Y_1, Y_2, \dots$
of r.v.'s, that is, $[\nu > n] \in \mathcal{B}_n$. We set $Y'_n = Y_n I_{[\nu \ge n]}$, so that $\mathcal{B}(Y'_1,$
$\dots, Y'_n) \subset \mathcal{B}_n$ and

$$E(Y'_{n+1} \mid Y'_1, \dots, Y'_n) = E\{I_{[\nu \ge n+1]} E(Y_{n+1} \mid Y_1, \dots, Y_n) \mid Y'_1, \dots, Y'_n\}.$$

Thus, if every $E(Y_{n+1} \mid Y_1, \dots, Y_n) = 0$ a.s., then $E(Y'_{n+1} \mid Y'_1,$
$\dots, Y'_n) = 0$ a.s., and the martingale sequence $X_n = \sum_{k=1}^{n} Y_k$ is transformed
by the above "$\nu$-truncation" into the martingale sequence
$X'_n = \sum_{k=1}^{n} Y'_k$. Observe that, the $\mathcal{B}_n$ being closed under countable operations,
we have

$$[\nu > n] \in \mathcal{B}_n \iff [\nu \le n] \in \mathcal{B}_n \iff [\nu = n] \in \mathcal{B}_n, \quad n = 1, 2, \dots.$$
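
The effect of the $\nu$-truncation can be checked exactly in a small case. The sketch below assumes, for illustration only, fair $\pm 1$ summands $Y_k$ and the particular choice of $\nu$ as the first $n$ with $X_n > a$; the function names are hypothetical. It verifies the martingale relation given $Y_1, \dots, Y_n$, which implies the relation given $Y'_1, \dots, Y'_n$ by the smoothing property since $\mathcal{B}(Y'_1, \dots, Y'_n) \subset \mathcal{B}_n$.

```python
from fractions import Fraction
from itertools import product

def truncated_sums(w, a):
    """X'_n = sum_{k<=n} Y_k I[nu >= k] along the outcome w, where nu is the
    first n with X_n > a (an illustrative choice of nu)."""
    out, total, stopped = [], 0, False
    for y in w:
        if not stopped:          # [nu >= k]: no earlier partial sum exceeded a
            total += y
            stopped = total > a
        out.append(total)
    return out

def is_martingale(N, a):
    """Check E(X'_{n+1} | Y_1..Y_n) = X'_n exactly for fair +/-1 summands."""
    for n in range(1, N):
        for head in product((-1, 1), repeat=n):
            now = truncated_sums(head, a)[-1]
            nxt = sum(Fraction(truncated_sums(head + (s,), a)[-1]) for s in (-1, 1)) / 2
            if nxt != now:
                return False
    return True

print(is_martingale(8, 2), truncated_sums((1, 1, 1, 1, -1), 2))
```

With $|Y_k| \le 1$ the truncated sums never exceed $a + 1$, which is the bound used in the zero-one law argument of this subsection.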

I. Zero-one laws. The zero-one laws of the case of independence
extend as follows:

Let $Y, Y_1, Y_2, \dots$ be r.v.'s and apply 29.3A. The sequence $Z_n =
E(Y \mid Y_1, \dots, Y_n)$ is a martingale closed on the right by $Y$ whose expectation
is assumed to exist, say, $EY^+ < \infty$. Since every $EZ_n^+ \le
EY^+$ and hence $\sup EZ_n^+ < \infty$, it follows that $Z_n \xrightarrow{\text{a.s.}} Z < \infty$. If
$E|Y|^r < \infty$ for some $r \ge 1$, then $Z_n \xrightarrow{\text{a.s.},\, r} Z = E(Y \mid Y_1, Y_2, \dots)$. If,
moreover, $Y$ is defined on the $Y_n$, then $Z_n \to Y$. To summarize

A. If $EY^+ < \infty$, then $E(Y \mid Y_1, \dots, Y_n) \xrightarrow{\text{a.s.}} Z < \infty$. If $E|Y|^r < \infty$
for some $r \ge 1$, then $E(Y \mid Y_1, \dots, Y_n) \xrightarrow{\text{a.s.},\, r} E(Y \mid Y_1, Y_2, \dots)$, which
reduces a.s. to $Y$ when $Y$ is defined on the $Y_n$.

We specialize now these properties. If $Y = I_B$ where the event $B$ is
defined on the $Y_n$, whence $B$ is a "property" of the sequence, then

$$P(B \mid Y_1, \dots, Y_n) \xrightarrow{\text{a.s.}} I_B. \quad \text{(P. Lévy.)}$$

In more intuitive terms

The sequence $P(B \mid Y_1, \dots, Y_n)$ of c.pr.'s of a property $B$ of the sequence
$Y_1, Y_2, \dots$, given the first $n$ terms of the sequence, converges a.s.
to 1 or to 0 according as the sequence has or has not this property.

In particular, if $P(B \mid Y_1, \dots, Y_n) = PB$ a.s. for every value of $n$
(or for a sequence of values of $n$), then $PB = I_B$ a.s. (Kolmogorov.)
In more intuitive terms

a. If the c.pr. of a property of a sequence of r.v.'s, given any finite number
of its terms, degenerates into a constant, then, a.s., the property is either
sure or impossible.

Also Borel's zero-one law extends as follows: Let $B_1, B_2, \dots$ be a
sequence of events and set $\mathcal{B}_n = \mathcal{B}(I_{B_1}, \dots, I_{B_n})$. The two events
$[\sum I_{B_n} < \infty]$ and $[\sum P^{\mathcal{B}_{n-1}} B_n < \infty]$ are equivalent (P. Lévy). In more
intuitive terms

b. The number of occurrences of the events $B_n$ is a.s. finite or infinite
according as the series of their c.pr.'s is a.s. finite or infinite.

Proof. The sequence $X_n = \sum_{k=1}^{n} Y_k$ where $Y_k = I_{B_k} - P^{\mathcal{B}_{k-1}} B_k$ (hence
$|Y_k| \le 1$ a.s.) is a martingale. Let $a > 0$ be a finite number and define
$\nu$ by $[\nu = n] = [\sup_{k < n} X_k \le a, X_n > a]$, $[\nu = \infty] = [\sup X_n \le a]$. The
$\nu$-truncated sequence $X'_n = \sum_{k=1}^{n} Y'_k$ is a martingale bounded above by
$a + 1$, and hence $X'_n \xrightarrow{\text{a.s.}} X'$ finite.

Since $X_n = X'_n$ on $[\sup X_n \le a]$ and $a > 0$ is an arbitrary finite
number, it follows that $X_n \to X$ finite on $[\sup X_n < \infty]$ except for a
null event. Since changing the $X_n$ into $-X_n$ preserves the martingale
property, it follows that $X_n \to X$ finite on $[\inf X_n > -\infty]$ except
for a null event.

But $0 \le \sum_{k=1}^{n} I_{B_k} \uparrow$ and $0 \le \sum_{k=1}^{n} P^{\mathcal{B}_{k-1}} B_k \uparrow$ a.s. so that both sequences
have a.s. a limit, finite or not. If one of them is finite and the other is
infinite, then $\sup X_n(\omega) = +\infty$ or $\inf X_n(\omega) = -\infty$. Thus both limits
are a.s. simultaneously either finite or infinite. The assertion is proved.

II. Convergence of series. Let $X_n = \sum_{k=1}^{n} Y_k$ form a martingale
sequence. According to 29.2A

$$P\Big[\sup_{k \le n} |X_k| > c\Big] \le \frac{1}{c^r} E|X_n|^r, \quad r \ge 1.$$

For $r = 2$, the summands are orthogonal, $EX_n^2 = \sum_{k=1}^{n} EY_k^2$, the inequality
reduces to the extended Kolmogorov inequality and yields
the results of 29.1. The martingales convergence properties yield more,
but the assumptions to be made are not easily expressed in terms of
the summands. However, the $\nu$-truncation method yields a direct extension
of the convergence property established in the case of independent
and uniformly bounded summands, as follows (P. Lévy):

If the summands of the martingale sequence $X_n = \sum_{k=1}^{n} Y_k$ are uniformly
bounded ($|Y_k| \le c < \infty$), then $X_n \xrightarrow{\text{a.s.}} X$ finite if, and only if, the
series $\sum E^{\mathcal{B}_{n-1}} Y_n^2$ is a.s. finite.

Proof. Let $a > 0$ be a finite number and define $\nu$ by $[\nu = n] =
[\sup_{k < n} |X_k| \le a, |X_n| > a]$, $[\nu = \infty] = [\sup |X_n| \le a]$. The sequence
$X'_n = \sum_{k=1}^{n} Y'_k$ of $\nu$-truncated summands is a martingale bounded by
$a + c < \infty$, and hence $X'_n \xrightarrow{\text{a.s.},\, r} X'$ for every $r \ge 1$. In particular, for
$r = 2$, we have $EX'^2 = \sum EY'^2_n < \infty$, so that the series $\sum E^{\mathcal{B}_{n-1}} Y'^2_n$,
whose expectation $EX'^2$ is finite, is a.s. finite. Since $E^{\mathcal{B}_{n-1}} Y'^2_n =
E^{\mathcal{B}_{n-1}} Y_n^2$ on $[\nu \ge n]$ and vanishes on $[\nu < n]$, it follows that the series
$\sum E^{\mathcal{B}_{n-1}} Y_n^2$ is a.s. finite on $[\nu = \infty] = [\sup |X_n| \le a]$. Since $a$ is arbitrary,
this series is a.s. finite on $[X_n \to X \text{ finite}]$.

Conversely, define $\nu$ by

$$[\nu = n] = \Big[\sum_{k=1}^{n+1} E^{\mathcal{B}_{k-1}} Y_k^2 > a \ge \sum_{k=1}^{n} E^{\mathcal{B}_{k-1}} Y_k^2\Big], \quad [\nu = \infty] = \Big[\sum E^{\mathcal{B}_{n-1}} Y_n^2 \le a\Big].$$

The corresponding sequence of $\nu$-truncated sums is a martingale bounded
by $a$. Upon taking the expectations, it follows that

$$E^2|X'_n| \le EX'^2_n = \sum_{k=1}^{n} EY'^2_k \le a,$$

so that, by 29.3A, $X'_n \xrightarrow{\text{a.s.}} X'$ finite; and, as above, $X_n \to X$ finite
on $[\sum E^{\mathcal{B}_{n-1}} Y_n^2 < \infty]$. The proof is terminated.

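III. Strong laws of large numbers. Let $X_n = \sum_{k=1}^{n} Y_k$ where the
$Y_k$ are "conditionally exchangeable" with respect to addition (implied
by ordinary exchangeability), that is, for every $n$ and $k \le n$,

$$E(Y_k \mid X_n, X_{n+1}, \dots) = E(Y_1 \mid X_n, X_{n+1}, \dots) \quad \text{a.s.}$$

According to 29.3B, if $E|Y_1|^r < \infty$ for some $r \ge 1$, then

$$E(Y_1 \mid X_n, X_{n+1}, \dots) \xrightarrow{\text{a.s.},\, r} E^{\mathcal{C}} Y_1.$$

Therefore

$$\frac{X_n}{n} = E\Big(\frac{X_n}{n} \,\Big|\, X_n, X_{n+1}, \dots\Big) = \frac{1}{n} \sum_{k=1}^{n} E(Y_k \mid X_n, X_{n+1}, \dots) = E(Y_1 \mid X_n, X_{n+1}, \dots) \xrightarrow{\text{a.s.},\, r} E^{\mathcal{C}} Y_1.$$

To summarize

If $X_n = \sum_{k=1}^{n} Y_k$ where the $Y_k$ are exchangeable and $E|Y_1|^r < \infty$ for some
$r \ge 1$, then $\dfrac{X_n}{n} \xrightarrow{\text{a.s.},\, r} E^{\mathcal{C}} Y_1$.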

n r 

IV. Independence. The foregoing results, in fact all the results of
this chapter, were obtained under the guidance but not by the use of
similar results in the case of independence. Thus we have new proofs
of the latter under the supplementary assumptions of independence.

First consider results in I. Proposition a reduces to the zero-one law
for tail events on sequences $Y_n$ of independent r.v.'s; since any tail
event $B \in \mathcal{C}_n = \mathcal{B}(Y_{n+1}, Y_{n+2}, \dots)$ whatever be $n$, and $\mathcal{B}_n$ and $\mathcal{C}_n$
are independent $\sigma$-fields, it follows that $P^{\mathcal{B}_n} B = PB$ a.s. whatever be
$n$. Similarly, proposition b reduces to the Borel zero-one criterion, since
then $P^{\mathcal{B}_{n-1}} B_n = PB_n$ a.s.

Now, let $X_n = \sum_{k=1}^{n} Y_k$ where the summands are independent r.v.'s
centered at expectations. Then, in II, the inequality extends Kolmogorov's
inequality for $r = 2$ to any $r \ge 1$, while the proposition proved
there reduces to the fact that, when the summands are uniformly
bounded, the series $\sum Y_n$ converges a.s. if, and only if, it converges in
q.m. ($r = 2$). As for III, it yields Kolmogorov's strong law of large
numbers, since then the tail $\sigma$-field is $\{\emptyset, \Omega\}$ a.s. and the limit is a tail
function.

To summarize, as had to be expected, the results in I, II, and III
provide the basic convergence properties of the case of independence,
but nothing new. However, if we do not limit ourselves to results expressed
in terms of summands, we get more (Marcinkiewicz).


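A. Let $X_n = \sum_{k=1}^{n} Y_k$ be consecutive sums of independent r.v.'s centered at
expectations, and let $r \ge 1$. Then $X_n \xrightarrow{\text{a.s.}} X \in L_r$ if, and only if,
$X_n \xrightarrow{r} X$.

Proof. If $X_n \xrightarrow{r} X$, then, by 29.3B, $X_n \xrightarrow{\text{a.s.}} X \in L_r$. Conversely,
let $X_n \xrightarrow{\text{a.s.}} X \in L_r$. Since $r \ge 1$, the r.v. $X$ is integrable. Since, for
every $p$, $X_{n+p} - X_n$ is independent of $X_1, \dots, X_n$, it follows that $X -
X_n$ is independent of $X_1, \dots, X_n$, so that

$$E^{\mathcal{B}_n} X = E^{\mathcal{B}_n}(X - X_n) + E^{\mathcal{B}_n} X_n = E(X - X_n) + X_n = EX + X_n \quad \text{a.s.}$$

But $X_n \xrightarrow{\text{a.s.}} X$, while, by IA, $E^{\mathcal{B}_n} X \xrightarrow{\text{a.s.}} X$. Therefore, $EX = 0$ so
that $E^{\mathcal{B}_n} X = X_n$ a.s. and the martingale sequence $X_n$ is closed by
$X \in L_r$. Thus, theorem 29.3B applies and, hence, $X_n \xrightarrow{r} X$. The
proof is terminated.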

*32.5. Indefinite expectations and a.s. convergence. Convergence
properties of martingales and, hence, all the applications of the preceding
subsection are but particular cases of a convergence theorem
that we establish now.

We consider sequences $X_n$ of r.v.'s whose indefinite expectations $\varphi_n$
defined by

$$\varphi_n(A) = \int_A X_n, \quad A \in \mathcal{A},$$

exist. We recall that $\mathcal{B} = \mathcal{B}(X_1, X_2, \dots)$ is the minimal $\sigma$-field over
the field $\mathcal{B}_0 = \bigcup \mathcal{B}(X_1, \dots, X_n)$, and $\mathcal{C} = \bigcap \mathcal{B}(X_n, X_{n+1}, \dots)$ is
the tail $\sigma$-field of the sequence $X_n$. We introduce the following hypothesis.

(H): There exists a set function $\varphi$ on $\mathcal{C}$ such that, as $n \to \infty$ and then
$m \to \infty$,

$$\sum_{k=m}^{n} \varphi_k(B_{mk}C) \to \varphi(C'C)$$

whatever be the disjoint in $k$ events $B_{mk} \in \mathcal{B}(X_m, \dots, X_k)$ such that
$\sum_{k=m}^{n} B_{mk} \to C'$, and whatever be the tail events $C$ and $C'$.

If the foregoing events $B_{mk}$ are replaced by disjoint in $k$ events
$B_{kn} \in \mathcal{B}(X_k, \dots, X_n)$, the so modified hypothesis will be denoted by
(H').

a. Basic inequalities. Under (H) or (H'),

$$\varphi(C_a C) \le aP(C_a C) \quad \text{and} \quad \varphi(\bar{C}_b C) \ge bP(\bar{C}_b C)$$

whatever be the tail event $C$ and whatever be the finite numbers $a$, $b$ in the
tail events

$$C_a = [\liminf X_n < a], \quad \bar{C}_b = [\limsup X_n > b].$$


Proof. Let $a < a_m \downarrow a$ as $m \to \infty$ and set

$$B_{mm} = [X_m < a_m], \quad B_{mk} = [X_m \ge a_m, \dots, X_{k-1} \ge a_m, X_k < a_m],$$

so that, as $n \to \infty$ and then $m \to \infty$,

$$\sum_{k=m}^{n} B_{mk} = \Big[\inf_{m \le k \le n} X_k < a_m\Big] \to C_a.$$

Since, for every event $C$,

$$\sum_{k=m}^{n} \varphi_k(B_{mk}C) = \sum_{k=m}^{n} \int_{B_{mk}C} X_k \le a_m P\Big(\sum_{k=m}^{n} B_{mk}C\Big),$$

it follows, upon letting $n \to \infty$ and then $m \to \infty$, that (H) entails

$$\varphi(C_a C) \le aP(C_a C).$$

The same inequality is entailed by (H') upon setting $B_{kn} = [X_n \ge a_m,$
$\dots, X_{k+1} \ge a_m, X_k < a_m]$, and the first asserted inequality is proved.
The second one follows by changing the $X_n$ into $-X_n$ and $a$ into $-b$.
The proof is complete.

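A. Basic convergence theorem. Under (H) or (H'), $X_n \xrightarrow{\text{a.s.}} X$
a.s. finite above or below according as $\varphi$ is bounded above or below. If,
moreover, $\varphi$ on $\mathcal{C}$ is $\sigma$-additive and $\sigma$-finite, then $X_n \xrightarrow{\text{a.s.}} X = \dfrac{d\varphi}{dP_{\mathcal{C}}}$ a.s.
finite.

$\dfrac{d\varphi}{dP_{\mathcal{C}}}$ denotes the $\mathcal{C}$-measurable function whose indefinite integral is the
$P_{\mathcal{C}}$-continuous part $\varphi_c$ of $\varphi$, and $P_{\mathcal{C}}$ is the restriction of $P$ to $\mathcal{C}$.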

Proof. Since the set $D = [\liminf X_n \ne \limsup X_n]$ of divergence of
the sequence $X_n$ can be written as a denumerable union $D = \bigcup C_{ab}$
where

$$C_{ab} = [\liminf X_n < a < b < \limsup X_n]$$

and $a$, $b$ ($a < b$) vary over all rationals, it suffices to prove that every
event $C_{ab}$ is null. But, by taking $C = C_{ab}$ in the basic inequality so
that

$$C_a C_{ab} = \bar{C}_b C_{ab} = C_{ab},$$

it follows that

$$bPC_{ab} \le \varphi(C_{ab}) \le aPC_{ab},$$

and, since $b > a$, we have $PC_{ab} = 0$. Thus, if $X = \lim X_n$ on $D^c$ and
we set, say, $X = 0$ on $D$, then $X_n \xrightarrow{\text{a.s.}} X$ where $X$ is a $\mathcal{C}$-measurable
function, not necessarily finite. If $\varphi \le c < \infty$, then, taking $C = \Omega$ in
the second basic inequality, we have, for every finite $b > 0$,

$$P[X = +\infty] = P[\limsup X_n = +\infty] \le P\bar{C}_b \le c/b \to 0 \ \text{as} \ b \to \infty,$$

so that $X < +\infty$ a.s. Similarly when $\varphi$ is bounded below, and the first
assertion is proved.

Let now $\varphi$ be $\sigma$-additive and $\sigma$-finite. By taking, if necessary, a
denumerable partition of $\Omega$ into events of $\varphi$-finite measure, it suffices as
usual to prove the last assertion for $\varphi$ on $\mathcal{C}$ $\sigma$-additive and finite, hence
bounded. Then $X$ is a.s. finite, and we can take $X$ to be finite by including
the null event $[X = \pm\infty]$ in the null event $D$ and setting, say,
$X = 0$ on $D$.

Let

$$C_{nk} = \Big[\frac{k}{2^n} \le X < \frac{k+1}{2^n}\Big], \quad k = 0, \pm 1, \pm 2, \dots,$$

and set

$$X'_n = \sum_{k=-\infty}^{+\infty} \frac{k}{2^n} I_{C_{nk}},$$

so that

$$X'_n \le X < X'_n + \frac{1}{2^n} \ \text{on} \ D^c.$$

On the other hand, the basic inequalities with $C$ replaced by $CC_{nk}D^c$
and $a$, $b$ replaced by $\dfrac{k+1}{2^n}$, $\dfrac{k}{2^n}$, respectively, become

$$\frac{k}{2^n} P(CC_{nk}D^c) \le \varphi(CC_{nk}D^c) \le \frac{k+1}{2^n} P(CC_{nk}D^c)$$

so that, by summing over $k$, we obtain

$$\int_{CD^c} X'_n \le \varphi(CD^c) \le \int_{CD^c} X'_n + \frac{1}{2^n} P(CD^c).$$

It follows that

$$\varphi(CD^c) - \frac{1}{2^n} \le \int_{CD^c} X \le \varphi(CD^c) + \frac{1}{2^n}$$

and, letting $n \to \infty$:

$$\int_C X = \int_{CD^c} X = \varphi(CD^c) = \varphi_c(C), \quad C \in \mathcal{C}.$$

Since $X$ is $\mathcal{C}$-measurable, the last assertion is proved.

Variant. It may happen that $\varphi$ is defined on the $\sigma$-field $\mathcal{B}$ of the
whole sequence $X_n$ and at the same time (H) or (H') continues to hold
when arbitrary $C \in \mathcal{C}$ are replaced by arbitrary $B \in \mathcal{B}$. The so modified
hypotheses will be denoted by (H$_0$) and (H'$_0$), respectively. The
same proofs continue to apply with $B$'s instead of $C$'s and we obtain

a$_0$. Under (H$_0$) or (H'$_0$),

$$\varphi(C_a B) \le aP(C_a B), \quad \varphi(\bar{C}_b B) \ge bP(\bar{C}_b B)$$

whatever be $B \in \mathcal{B}$ and the finite numbers $a$, $b$.

A$_0$. Under (H$_0$) or (H'$_0$), $X_n \xrightarrow{\text{a.s.}} X$ a.s. finite above or below according
as $\varphi$ is bounded above or below. If, moreover, $\varphi$ on $\mathcal{B}$ is $\sigma$-additive and
$\sigma$-finite, then $X_n \xrightarrow{\text{a.s.}} X = \dfrac{d\varphi}{dP_{\mathcal{B}}}$ a.s. finite.

Corollary 1. Let $\varphi$ on $\mathcal{B}_0$ be a $\sigma$-finite signed measure, so that it determines
its $\sigma$-additive and $\sigma$-finite extension $\varphi$ on $\mathcal{B}$.

If (H) or (H') hold when $C \in \mathcal{C}$ are replaced by $B_0 \in \mathcal{B}_0$, then $X_n \xrightarrow{\text{a.s.}}
X = \dfrac{d\varphi}{dP_{\mathcal{B}}}$ a.s. finite.

Proof. The basic inequalities hold with $C \in \mathcal{C}$ replaced by $B_0 \in \mathcal{B}_0$.
Therefore, by continuity of $P$ and $\varphi$, they hold with $B \in \mathcal{B}$ and the
above variant applies.

Corollary 2. Let 屮 on be a a-finite signed measure. If y given any 
€ > 0, fork sufficiently large 

I <Pk(B) - <p(B) I ^ ePB 


whatever be B 不 +i ， 

义 = 表(表). 


)(B G (R(X U •••, X k )\ then X n 


a.s. 


Proof. In the first case, hypothesis (H’ ） holds, since <p extends to (B 
and, for m sufficiently large, 


H, <pk{BknC) 


- 中 (i> c ) 


^ € Z PBknC ^ € 


as $n \to \infty$, then $m \to \infty$, and then $\epsilon \to 0$. Theorem A applies, and
the assertion is proved. Similarly, in the second case, hypothesis (H$_0$)
holds, and theorem A$_0$ applies.

Application to martingales. 1° Let the r.v.'s $X_n$ form a martingale
sequence or a martingale reversed sequence, closed by a r.v. $Y$
whose expectation exists, that is, $X_n = E(Y \mid X_1, \dots, X_n)$ a.s. or
$X_n = E(Y \mid X_n, X_{n+1}, \dots)$ a.s. Then Corollary 2 applies with $\varphi$ indefinite
integral of $Y$; in fact, in each case $E_B X_k = E_B Y = \varphi(B)/PB$. Thus

$X_n \to X = E^{\mathcal{C}}Y$ or $E^{\mathcal{B}}Y$, respectively.

2° Let the r.v.'s $X_n$ with $\sup E|X_n| \le c < \infty$ form a martingale
sequence. Take for pr. space the “sample pr. space,” that is, the range
space of the sequence together with its Borel field $\mathcal{A}_\infty$ and the pr.
distribution $P_\infty$ of the sequence. Then $\mathcal{B}(X_1, \dots, X_n)$ is the $\sigma$-field $\mathcal{A}_n$
of all Borel cylinders whose bases are Borel sets in the range space of
$(X_1, \dots, X_n)$. Since the $X_n$ form a martingale sequence, we have
$\varphi_n(A_n) = \varphi_{n+1}(A_n) = \cdots$ for every $A_n \in \mathcal{A}_n$ and, hence, $\varphi_n \to \varphi$ on
$\mathcal{A}_0$, the field of all Borel cylinders in the range space. We apply the
extension 4.3A, 2°. Since the indefinite expectations of $|X_n|$ are
bounded by $c$ and form a nondecreasing sequence, it follows that
their limit exists and is bounded and $\sigma$-additive on $\mathcal{A}_0$. A fortiori, $\varphi$ is
bounded and $\sigma$-additive on $\mathcal{A}_0$. Thus, Corollary 2 applies and $X_n \to$

X 一 W • 

We seek now necessary and sufficient conditions for a.s. convergence
of sequences $X_n$ of r.v.'s.

B. Dominated convergence criterion. Let $|X_n| \le Y$ integrable.
Then $X_n \xrightarrow{\text{a.s.}} X$ (necessarily integrable) if, and only if, (H) or (H') or
(H$_0$) or (H'$_0$) holds.

Thus, if $|X_n| \le Y$ integrable and $X_n \xrightarrow{\text{a.s.}} X$, then all the hypotheses
H are equivalent.

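Proof. The “if” assertion is contained in A and A$_0$. Conversely, let
$X_n \xrightarrow{\text{a.s.}} X$ with indefinite expectation $\varphi$ so that, for every $\epsilon > 0$,

$$PB_\epsilon = P\left[\sup_{k \ge m} |X_k - X| \ge \epsilon\right] \to 0 \quad \text{as } m \to \infty.$$

Whatever be the disjoint events $B_k \in \mathcal{B}$ varying or not with $m$ and $n$
and whatever be $B \in \mathcal{B}$ such that $\sum_{k=m}^{n} B_k \to B$, upon summing over
$k = m, \dots, n$ the relations

$$\varphi_k(B_k) - \varphi(B_k) = \int_{B_k} (X_k - X) = \int_{B_k B_\epsilon^c} (X_k - X) + \int_{B_k B_\epsilon} (X_k - X),$$

it follows from $|X_k| \le Y$ integrable that

$$\left|\sum_{k=m}^{n} \varphi_k(B_k) - \varphi(B)\right| \le \epsilon PB_\epsilon^c + 2\int_{B_\epsilon} Y + \left|\varphi\left(\sum_{k=m}^{n} B_k\right) - \varphi(B)\right|.$$

Letting $n \to \infty$, then $m \to \infty$, and then $\epsilon \to 0$, the “only if” assertion
follows.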

C. Convergence criterion. A sequence $X_n$ of r.v.'s converges a.s. to
a r.v. if, and only if, for every $\epsilon > 0$ there exist events $B_\epsilon$ with $PB_\epsilon > 1 - \epsilon$
on which $|X_n| \le Y_\epsilon$ integrable and (H) or (H') or (H$_0$) or (H'$_0$) holds.

Proof. If for $\epsilon_m \downarrow 0$ as $m \to \infty$, we have $|X_n| \le Y_{\epsilon_m}$ integrable and,
say, (H) holds on $B_m = B_{\epsilon_m}$; then by B, the sequence $X_n$ converges to
a r.v. on $B_m - N_m$ where $N_m$ are null events; hence it converges to a
r.v. on $\bigcup B_m - \bigcup N_m$. Since, for every integer $m'$,

$$P\bigcup B_m \ge PB_{m'} > 1 - \epsilon_{m'},$$

it follows that $P\bigcup B_m = 1$ and the “if” assertion is proved.

Conversely, let $X_n \xrightarrow{\text{a.s.}} X$ r.v. and apply Egorov's theorem which
asserts that for every $\epsilon > 0$ there exists an event $B$ with $PB < \frac{\epsilon}{2}$ such
that $X_n \to X$ uniformly on $B^c$. Let $n$ and $c > 0$ be sufficiently large
so that, on the one hand, $|X_n| < |X| + 1$ on $B^c$ and, on the other
hand,

$$PC = P[|X| > c] < \frac{\epsilon}{2}.$$

Then $|X_n| < c + 1$ on $B^cC^c$ and, by B, (H) holds. Since

$$1 - PB^cC^c = P(B \cup C) \le PB + PC < \epsilon,$$

the “only if” assertion is proved.

COMPLEMENTS AND DETAILS 

1. Let $S_n = \sum_{k=1}^{n} X_k$ where $E(X_n \mid X_1, \dots, X_{n-1}) = 0$ a.s. ($X_0 = 0$) and
$E(X_n^2 \mid X_1, \dots, X_{n-1}) = \sigma^2 X_n$ a.s. Find conditions for degenerate and for
normal convergence of suitably normed sums.

2. Take one by one the results relative to the degenerate, the normal, and 
the Poisson convergence obtained in the case of independent summands, and 
transpose them to the case of dependent summands by using successively the 
various approaches given and illustrated in the text. 

3. Let Sy n = Xk 、 Xq = 0. The summands are independent with common 

h »vl 

ch.f. / and finite a = EX ny a 2 = cr 2 X n . The v n are integer-valued r.v/s inde¬ 
pendent of all the summands with p n k = P[v n = k] y 走 = 0 ， 1， • • •，and with 
finite a n = Ev nj j8 n 2 = <rV n . Set <r n 2 = cr 2 ^ n and let g n be the ch.f. of 
UJAn. Then 

Sn(u) = pnke^ ia ^ ! ^fk{u/(T n ), or n 2 = a n a 2 + a 2 ^ n 2 . 
ifc-0 

Let <r» 2 — oo, iS n 2 = O(or„ 2 ). Then g n (u) is of the form 

2 

gnW = h{c n u)e 2 + o(\)y Cn = ^nj3 n /(Tn. 

If also a 2 ^ n 2 = o(a n ) y then g n (u) 一 r u2/2 . 

If also £(u n ) 〜 9^(0：” j3 n 2 ) y then £(S y J ~ fR(aa n , <r n 2 ). 
What happens when £(j/ n ) is a Poisson law (P(«), or when £(j/ n ) is a binomial
law with 

4. Search for conditions under which the various limit laws obtained in the
case of independent summands remain the same when the numbers of
summands are r.v.'s $\nu_n$ independent of the summands. What happens when
$\nu_n \xrightarrow{\text{a.s.}} \infty$?

5. By following the indications given in the text transpose to the case of 
dependent summands as many as possible of the a.s. convergence and a.s. 
stability theorems obtained in the case of independence. 

6. Let Y be an integrable r.v* defined on a sequence X n of r.v/s. Take for 
pr. space the sample space of the sequence: 12 = XI d-Borel field in 12, 
P — pr. distribution of the sequence. 

If the X n are independent, with pr. distributions P ny then 

/ SI S 

YdP n ^dP n ^^- —> y 

/ ft s 

YdP^^ dP n ^i —> EY. 

What becomes of the integrals when the X n are dependent? 

7. A net is a sequence of countable partitions $\Omega = \sum_j A_{nj}$ into events such
that every partition is finer than the preceding one. Every partition
determines a $\sigma$-field $\mathcal{B}_n$, and $\mathcal{B}_n \uparrow \mathcal{B}$. Let $\varphi$ on $\bigcup \mathcal{B}_n$ be bounded and let it be
$\sigma$-additive on every $\mathcal{B}_n$. Set $X_n = \sum_j x_{nj} I_{A_{nj}}$ with $x_{nj} = \varphi(A_{nj})/P(A_{nj})$ and throw
out the null union of all null events $A_{nj}$.

The sequence $X_n$ is a martingale and its a.s. limit, if it exists, is called the
derivative of $\varphi$ with respect to $P$ given the net.

If <p is <r-additive on \J (B n , then it extends to on (B bounded and <r-additive, 

d it) 

and X n X — -jp- . The X n are uniformly integrable if, and only if, <p is 

a.s. 

P-continuous and then X n X. 

Particular case. Let 12 = [0, 1], P is the Lebesgue measure, <p is determined 
by a function // on Q of bounded variation. Consider a net of partitions into 
intervals such that the length of the largest converges to 0* Then (B is the 

Borel field in Q and X n —— > //^-derivative of H known to exist a.s* 

8. If $g$ on $R$ is a Lebesgue integrable function of period 1 and
$g_n(x) = \frac{1}{2^n} \sum_{k=1}^{2^n} g\left(x + \frac{k}{2^n}\right)$, then $g_n \to \int_0^1 g(x)\,dx$ a.e.
(Let $\Omega = R$, $\mathcal{A}$ = $\sigma$-field of Lebesgue
sets $A$ of period 1, $PA$ = Lebesgue measure of a period of $A$. Let $\mathcal{B}_n$ be
a similar $\sigma$-field of sets of period $1/2^n$. The $\sigma$-field $\bigcap \mathcal{B}_n$ consists only of sets
of pr. 0 or 1. Use martingale reversed sequences convergence theorem.)
or


9. Let 妒 on (2 be bounded and cr-additive* Then X 

dip 


dip 


a.s. 


X 


dip 


dP 


e 


dP 〜 dP 也 

according as ® n T or ® n | . (Either start with 妒 ^ 0 and observe that 


the sequence —X n is a submartingale; or first extend the basic convergence 

d<p n 

theorem 29*4A to X n = where <p n = <p n c + <p n 8 and X n a.s. finite with 
(O 士⑷ = V>n{A[X = zfcoo]).) 

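10. Let $m, n \to \infty$ and let $r \ge 1$. Let $\mathcal{B}_m$, $\mathcal{B}$ be sub-$\sigma$-fields of events with
$\mathcal{B}_m \uparrow \mathcal{B}$ or $\mathcal{B}_m \downarrow \mathcal{B}$.

(a) $X_n \xrightarrow{r} X \in L_r \Rightarrow E(X_n \mid \mathcal{B}_m) \xrightarrow{r} E(X \mid \mathcal{B})$.

For, by the $c_r$-inequality and martingales convergence,

$$E|E(X_n \mid \mathcal{B}_m) - E(X \mid \mathcal{B})|^r \le c_r E|E(X_n - X \mid \mathcal{B}_m)|^r + c_r E|E(X \mid \mathcal{B}_m) - E(X \mid \mathcal{B})|^r$$
$$\le c_r E|X_n - X|^r + c_r E|E(X \mid \mathcal{B}_m) - E(X \mid \mathcal{B})|^r \to 0.$$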

(b) O^Xn^XCLr^ E(X n \ (B m ) E(X\(R). 

T 

Use (a) and, by martingale convergence and conditional monotone con¬ 
vergence, 

inf E(X k I (By) ^ inf E(X n | (By) =» lim inf E(X n \ (B m ) 

j ^m t k^n j m t n 

^ lim lim E(X n \ (B m ) = £(Z|(B)a.s. 

n m 

sup E(X k I (By) ^ sup E(X I (By) =» lim sup E(X n | (B m ) 

j ^m f k^n j m f n 

^ E(X\ (B m ) - E{X I «) a.s. 

(c) inf Xn^iLry lim inf X n €.L r ^ £(liminf X n |(B)^ lim inf E(X n | (B m ) a.s. 

m，n 


n 


n 


n 


sup X n c Lry lim sup X n L r ^ lim sup E(X n | ® w ) ^ £(lim sup X n | ©) a.s< 

m，n 


n 


n 


n 


a.s. 


Use (b) as in the Fatou-Lebesgue Theorem. 

a a 

⑷ sup \x n \c Lr, X n X E{X n I (B m ) ^ E(X\(R). Use (a) 
and (c). What if Z n 4 Z? " 




ERGODIC THEOREMS 


Ergodic theory has a phenomenological origin which, on account of
the Liouville theorem, leads to the study of measure-preserving
transformations. The “classical period” (1930-1944) is concerned with
one-to-one measure-preserving transformations $T_1$ of a measure space onto
itself (in what follows we limit ourselves to a pr. space $(\Omega, \mathcal{A}, P)$ and
leave out results with which we are not concerned). Let $X$, $Y$ be r.v.'s
and define $X_n$, $Y_n$ by $X_n(\omega) = X(T_1^n\omega)$, $Y_n(\omega) = Y(T_1^n\omega)$.

The first two basic results are

von Neumann's: if $X \in L_2$, then $\frac{1}{n}\sum_{k=1}^{n} X_k$ converges in q.m.

Birkhoff's: if $X \in L_1$, then $\frac{1}{n}\sum_{k=1}^{n} X_k$ converges a.s.

Then Khintchine gets rid of the supplementary but unnecessary
assumptions made by Birkhoff and, at the same time, simplifies very
considerably his proof; Hopf extends this theory to ratios of sequences
$\sum_{k=1}^{n} X_k \big/ \sum_{k=1}^{n} Y_k$ and proceeds to a systematic investigation of ergodic
properties; Yosida and Kakutani extend von Neumann's theorem to
transformations on Banach spaces and apply them to Markov chains.
The “modern period” (1944- ) is characterized by several weakenings
of the ergodic setup. Hurewicz and Halmos abandon, at least
partly, the initial setup and start with set functions from which r.v.'s
are derived, whereas Dunford and Miller abandon definitely the
measure-preserving property and obtain necessary and sufficient conditions
for convergence in the first mean. Then, F. Riesz gives very simple
proofs of the Birkhoff theorem and of the sufficiency part of the
Dunford-Miller theorem, and, at the same time, he abandons the one-to-one
assumption about transformations. Doob applies Birkhoff's theorem to
stationary chains. Finally, Hartman and Ryll-Nardzewski, employing
methods developed by Y. Dowker in investigating “potentially
invariant” measures, obtain necessary and sufficient conditions for a.s.
convergence on $L_1$ to $L_1$.

In pr. theory we are interested not in the behavior of averages of
sequences $\{X_n\}$ due to any r.v. $X \in L_1$ but in that of individual sequences.
Stripped of all considerations due to their phenomenological origin,
ergodic theorems assert conditions, in terms of point or, more generally,
set transformations, under which averages $\frac{1}{n}\sum_{k=1}^{n} X_k$ of r.v.'s converge
in some sense and, then, assert conditions under which the limits are
degenerate. In the case of independence, such theorems reduce to
limit theorems as expounded at length in Part III. Thus, Bernoulli's
law of large numbers is to be thought of as the first ergodic theorem
with convergence in pr. and Tchebichev's proof as an improvement of
the conclusion, with convergence in q.m. Above all, Borel's strong
law of large numbers is to be thought of as the first “best” ergodic
theorem, with a.s. convergence.

What characterizes the ergodic method, as compared with those of
Part III, is that the conditions for convergence are to be in terms of
iterated translations, in a sense to be made precise in this chapter.
We shall establish a basic ergodic inequality and deduce from it ergodic
theorems which contain the mentioned results (sometimes improved or
completed). The reader will recognize the proofs (and hence the
statements) which remain valid when $P$ is replaced by a $\sigma$-finite or an
arbitrary measure $\mu$.


§33. TRANSLATION OF SEQUENCES; BASIC ERGODIC
THEOREM AND STATIONARITY


*33.1. Phenomenological origin. Let $q = (q_1, \dots, q_N)$, $p = (p_1, \dots, p_N)$
be the generalized coordinates and momenta of a “conservative”
mechanical system with $N$ degrees of freedom. The equations of
motion of the system are

$$\frac{dq_i}{dt} = \frac{\partial H}{\partial p_i}, \qquad \frac{dp_i}{dt} = -\frac{\partial H}{\partial q_i}, \qquad i = 1, \dots, N,$$

where $H = H(q, p)$ is the Hamiltonian of the system, independent of
time $t$. The “states” of the system are represented by points $\omega = (q, p)$
of the $2N$-dimensional real space $\Omega$, the “phase space” of the system.

Under the equations of motion, every point $\omega \in \Omega$ which represents
the “initial state” (at time $t = 0$) moves along a trajectory and this
motion describes the evolution of the system; the state at time $t$ is the
point $T_t\omega$. Thus the phase space can be envisioned as a fluid in motion
within itself. The celebrated Liouville's theorem asserts that in this
motion the Lebesgue measure $\lambda$ of Lebesgue sets $A$ remains invariant.
More precisely, in a unit of time the phase space undergoes a one-to-one
transformation $\omega \to T_1\omega$ such that $\lambda T_1 A = \lambda A$. To simplify, we
consider only discrete values of time $0, 1, 2, \dots$. In fact, the
conservative systems have constant energy $E$ and the possible trajectories lie
on the surface $H = E$, whence the term “ergodic” from ergos meaning
energy; then the invariant measure $\mu$ is defined by $d\mu = d\sigma / |\operatorname{grad} H|$
where $d\sigma$ is the differential area of an element of the surface and grad $H$
is taken at a point of the element.
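
As an illustration of the measure-preserving property, here is a minimal numerical sketch; it is not part of the argument of the text. It assumes a pendulum Hamiltonian $H(q, p) = p^2/2 - \cos q$ and replaces the exact unit-time transformation $T_1$ by one step of the symplectic Euler scheme (a swapped-in discretization). The names `step` and `jacobian_det` and the step size `h` are hypothetical. The Jacobian determinant of the map is estimated by central differences; a value of 1 is the local form of invariance of Lebesgue measure.

```python
import math

def step(q, p, h=0.1):
    """One application of an assumed discrete map T_1 for H = p**2/2 - cos(q)."""
    p_new = p - h * math.sin(q)   # dp/dt = -dH/dq
    q_new = q + h * p_new         # dq/dt = +dH/dp
    return q_new, p_new

def jacobian_det(q, p, h=0.1, eps=1e-6):
    """Central-difference estimate of det d(q', p') / d(q, p)."""
    qa, pa = step(q + eps, p, h)
    qb, pb = step(q - eps, p, h)
    dq_dq, dp_dq = (qa - qb) / (2 * eps), (pa - pb) / (2 * eps)
    qa, pa = step(q, p + eps, h)
    qb, pb = step(q, p - eps, h)
    dq_dp, dp_dp = (qa - qb) / (2 * eps), (pa - pb) / (2 * eps)
    return dq_dq * dp_dp - dq_dp * dp_dq

# Determinants at a few arbitrary phase points; each should be close to 1.
dets = [jacobian_det(q, p) for q in (-2.0, 0.3, 1.7) for p in (-1.0, 0.0, 2.5)]
```

A determinant equal to 1 at every point means that the map carries every small cell of the phase plane onto a cell of the same area, which is what $\lambda T_1 A = \lambda A$ expresses for this two-dimensional example.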

The comparison between theory and reality is made by measuring
“observables” or “phase-functions,” that is, $\lambda$-measurable functions of the
state. The ideal would be to observe the consecutive states (their
components are phase functions). Yet, in statistical mechanics, insurmountable
difficulties of experimental as well as computational nature arise.
In the systems under investigation the number of constituents or
“particles” is extremely large and so is the number of degrees of freedom, as
well as the number of microscopic phenomena in a unit of time of the
observer (such as collisions of particles between themselves and with
the walls of the container). Thus from the microscopic point of view,
the time required by the observer to measure an observable $X$ is
extremely large, and its observed values are to be compared not to its
instantaneous theoretical values but to the theoretical time-averages

$$\frac{1}{n}\left\{X(\omega) + X(T_1\omega) + \cdots + X(T_1^{n-1}\omega)\right\}$$

for $n$ extremely large. However, computation of time-averages
requires knowledge of consecutive states $T_1^n\omega$ given the initial state $\omega$.
Yet the exact knowledge of the initial state is experimentally unattainable.
Even if it were attainable, then theoretical knowledge of
succeeding states would require integration of an extremely large number
of equations of motion, which is practically impossible. Thus, some
other way of evaluating theoretical time-averages is to be found or
postulated. Physicists were led to replace the exact initial state by the
set of all possible states compatible with the precision of the
experimental data and to postulate equality of time-averages with
phase-averages over the multiplicity described by the trajectories of the
initial states compatible with the data; this is the “ergodic hypothesis.”
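
The equality of time-averages and phase-averages postulated above can be observed numerically in a simple case. The sketch below is an illustration under stated assumptions, not the physical setting of the text: the phase space is taken to be $[0, 1)$ with Lebesgue measure, $T_1$ is the rotation $\omega \to \omega + \alpha \pmod 1$ with $\alpha$ irrational (here the golden-ratio conjugate), and the observable is $X(\omega) = \omega^2$, whose phase-average is $1/3$. The names `rotate` and `time_average` are hypothetical.

```python
import math

ALPHA = (math.sqrt(5.0) - 1.0) / 2.0   # assumed irrational rotation angle

def rotate(omega):
    """Measure-preserving map T_1 on [0, 1): rotation by ALPHA."""
    return (omega + ALPHA) % 1.0

def time_average(observable, omega, n):
    """(1/n) * (X(omega) + X(T_1 omega) + ... + X(T_1^(n-1) omega))."""
    total = 0.0
    for _ in range(n):
        total += observable(omega)
        omega = rotate(omega)
    return total / n

# Time-averages from three different initial states.
averages = [time_average(lambda w: w * w, start, 10000) for start in (0.0, 0.25, 0.9)]
phase_average = 1.0 / 3.0   # integral of w**2 over [0, 1)
```

In this example the time-averages do not depend appreciably on the initial state and are close to the phase-average, which is the behavior the ergodic hypothesis postulates.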
From the physicist’s point of view, the justification of the ergodic
hypothesis lies in its practical success. For the mathematician, ergodic 
theory was at first an attempt to justify theoretically the ergodic hy¬ 
pothesis. But bitter experience imposed supplementary hypotheses. 
Ergodic theorems assert conditions under which sequences of time-aver¬ 
ages converge in some sense and then conditions under which the limits 
are independent of the initial state and reduce to phase averages. From 
the experimental point of view, comparison between theory and obser¬ 
vation will be best when measurements of the observables yield approxi¬ 
mate values of the time-averages, that is, when the convergence will 
be an everywhere or at least an a.e. convergence. 

33.2. Basic ergodic inequality. Let the pr. space $(\Omega, \mathcal{A}, P)$ be fixed.
On a family of one or several sequences of r.v.'s, say $\{X_n\}$, $\{Y_n\}$, we
define Borel functions of the family, say $\xi$. These functions are measurable
and, more precisely, are $\mathcal{B}$-measurable where $\mathcal{B}$ is the sub $\sigma$-field
of events induced by the family. The translate by $k - 1$ of $\xi$ is the
function $\xi_k$ (so that $\xi_1 = \xi$) obtained by adding $k - 1$ ($k = 1$ or 2 or $\dots$)
to the subscripts of all those r.v.'s of the family which figure in the
definition of $\xi$. Thus $\xi_k$ is defined on the family $\{X_{n+k-1}\}$, $\{Y_{n+k-1}\}$,
the translate by $k - 1$ of the given family, exactly as $\xi$ is defined on
the original one. We say that $\xi$ is invariant (under translations) if it
coincides with all its translates. Translations of indicators of events
of $\mathcal{B}$ define translations of the events; in other words, the above
definitions and notation apply to events $B \in \mathcal{B}$: replace $\xi$ by $B$ and $\xi_k$ by
$B_k$. Because of the definition, translations preserve countable set
operations on events belonging to $\mathcal{B}$. It follows that the class of all
invariant events is closed under countable set operations, hence is a sub
$\sigma$-field $\mathcal{C}$ of events, the invariant $\sigma$-field defined on the family, and
invariance of measurable functions defined on the family means
$\mathcal{C}$-measurability.

The concept of translation can be interpreted by means of the range
space of the family: the Borel space of points $(x_1, y_1, x_2, y_2, \dots)$. For
example, the translate $[X_k < Y_k]$ of $[X_1 < Y_1]$ is obtained as follows.
The event $[X_1 < Y_1]$ is the inverse image (under the family) of the
Borel set $[x_1 < y_1]$ in the range space. Then the translate $[X_k < Y_k]$
is, by definition, the inverse image of the Borel set $[x_k < y_k]$. In fact,
to avoid any ambiguity, in what follows it suffices to think of the pr. space
as being the sample pr. space of the double sequence.
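
The range-space description of translation can be sketched in a few lines of code. This is only an illustration under simplifying assumptions: a sample point is represented by finite stretches of the two sequences, a function of the family is an ordinary function of these stretches, and the names `translate` and `first_less` are hypothetical.

```python
def translate(xi, k):
    """Return xi_k, the translate by k - 1 of xi: xi applied to the shifted family."""
    def xi_k(xs, ys):
        # Adding k - 1 to every subscript amounts to dropping the first k - 1 terms.
        return xi(xs[k - 1:], ys[k - 1:])
    return xi_k

def first_less(xs, ys):
    """Indicator of the event [X_1 < Y_1]."""
    return 1.0 if xs[0] < ys[0] else 0.0

# A sample point (x_1, y_1, x_2, y_2, ...) given by finite stretches of both sequences.
xs = [3.0, 1.0, 5.0, 2.0]
ys = [2.0, 4.0, 5.0, 7.0]

# Indicators of [X_k < Y_k] for k = 1, ..., 4, obtained by translation only.
values = [translate(first_less, k)(xs, ys) for k in range(1, 5)]
```

Here `translate(first_less, k)` is the indicator of $[X_k < Y_k]$, obtained without rewriting the definition of the event, exactly as the translate is the inverse image of $[x_k < y_k]$.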

Convention and notation. In this chapter we shall reserve
subscripts to indicate translations and otherwise use superscripts, not to
be confused with power indices. Denote by $\mathcal{B}$ the $\sigma$-field of events
induced by the family $\{X_n\}$, $\{Y_n\}$, by $\mathcal{C}$ the $\sigma$-field of invariant events,
and set

$$X^n = \sum_{k=1}^{n} X_k, \qquad Y^n = \sum_{k=1}^{n} Y_k.$$

To avoid undefined ratios, and to make the limits independent of an
arbitrary but finite number of the summands, we assume once and for
all that

$$Y_n > 0, \qquad Y^n \uparrow \infty.$$

Then, from

$$\frac{X_2 + \cdots + X_{n+1}}{Y_2 + \cdots + Y_{n+1}} = \frac{X^{n+1} - X_1}{Y^{n+1} - Y_1} = \frac{X^{n+1}}{Y^{n+1}} \cdot \frac{Y^{n+1}}{Y^{n+1} - Y_1} - \frac{X_1}{Y^{n+1} - Y_1}$$

it follows that $\liminf \frac{X^n}{Y^n}$ and $\limsup \frac{X^n}{Y^n}$ are invariant and, hence, so
are the sets of convergence and of divergence of the sequence $\frac{X^n}{Y^n}$:

$$\left[\liminf \frac{X^n}{Y^n} = \limsup \frac{X^n}{Y^n}\right], \qquad \left[\liminf \frac{X^n}{Y^n} \ne \limsup \frac{X^n}{Y^n}\right],$$

as well as the events

$$C_a = \left[\liminf \frac{X^n}{Y^n} < a\right], \qquad C^b = \left[\limsup \frac{X^n}{Y^n} > b\right].$$

So far, the defined concepts do not contain the pr. $P$; they are expressed
only in terms of the measurable space $(\Omega, \mathcal{A})$ and of measurable functions
on this space. However, when the r.v.'s represent points of $L_r$-spaces,
the transformations have to be equivalence-preserving. It suffices for
the translates of null events to be null, to replace the assumptions on the
$Y_n$'s by $Y_n > 0$ a.s. and $Y^n \uparrow \infty$ a.s., and at the same time to replace
invariance by a.s. invariance, that is, invariance when a null event is neglected.

We establish now the basic inequality from which we deduce the
basic ergodic theorem. But first we require an elementary lemma.
Let $a_1, a_2, \dots, a_{n+m}$ be finite numbers. We say that $a_k$ is $m$-positive
if at least one of the sums $a_k + \cdots + a_l$ containing no more than $m$
summands is positive. In symbols, $a_k$ is $m$-positive if $\sup_l (a_k + \cdots + a_l) > 0$
for $k \le l \le \min(k + m - 1, n + m)$.

a. F. Riesz's lemma. If there exist $m$-positive terms, then their sum
is positive.

Proof. Let $a_k$ be the first $m$-positive term and let $a_k + \cdots + a_l$ be
the shortest positive sum starting with $a_k$. If one of the terms $a_h$ of
this sum is not $m$-positive, then $a_h + \cdots + a_l \le 0$ so that
$a_k + \cdots + a_{h-1} > 0$ and the sum is not the shortest positive one. Thus, all
its terms are $m$-positive. Hence, the successive $m$-positive terms form
disjoint stretches of positive sums. The assertion follows.
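
The lemma is easy to check numerically. The following sketch is an illustration only: the helper names are hypothetical, indices are 0-based, and integer terms are used so that the comparisons are exact. It marks the $m$-positive terms of a finite sequence and verifies, on a small example and on random trials, that their sum is positive whenever such terms exist.

```python
import random

def m_positive_indices(a, m):
    """0-based indices k such that some sum a[k] + ... + a[l] with at most m terms is > 0."""
    found = []
    for k in range(len(a)):
        partial = 0
        for l in range(k, min(k + m, len(a))):
            partial += a[l]
            if partial > 0:
                found.append(k)
                break
    return found

def sum_of_m_positive_terms(a, m):
    """Sum of the m-positive terms of a."""
    return sum(a[k] for k in m_positive_indices(a, m))

# Small example: the m-positive terms of (-1, 2, -3, 1) for m = 2.
example = m_positive_indices([-1, 2, -3, 1], 2)

# Random trials with integer terms.
random.seed(0)
checks = []
for _ in range(1000):
    a = [random.randint(-5, 5) for _ in range(12)]
    m = random.randint(1, 5)
    checks.append(not m_positive_indices(a, m) or sum_of_m_positive_terms(a, m) > 0)
```

In the small example the first, second and fourth terms are 2-positive, and their sum is $-1 + 2 + 1 = 2 > 0$, although the first term itself is negative.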


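A. Basic ergodic inequality. If

$$B^m = \left[\sup_{j \le m} \frac{X^j}{Y^j} > b\right] \quad \text{and} \quad Z_n > 0,$$

then for every integer $n$ and every invariant event $C$

$$\sum_{k=1}^{n} \int_{B_k^m C} \left(\frac{X_k}{Z_n} - b\frac{Y_k}{Z_n}\right) + \sum_{k=n+1}^{n+m} \int_C \left(\frac{X_k}{Z_n} - b\frac{Y_k}{Z_n}\right)^+ \ge 0,$$

provided the sum exists.

Proof. Let $k = 1, 2, \dots, n + m$, $k \le l \le \min(k + m - 1, n + m)$,
and let

$$B^{mk} = [X_k - bY_k \text{ is } m\text{-positive}] = \left[\sup_l \frac{X_k + \cdots + X_l}{Y_k + \cdots + Y_l} > b\right].$$

If $k \le n$, then $l$ varies from $k$ to $k + m - 1$ and, hence, $B^{mk} = B_k^m$
where $B_k^m$ is the translate by $k - 1$ of $B^m$. Since by Riesz's lemma

$$\sum_{k=1}^{n+m} (X_k - bY_k) I_{B^{mk}} \ge 0,$$

a fortiori

$$\sum_{k=1}^{n} (X_k - bY_k) I_{B_k^m C} + \sum_{k=n+1}^{n+m} (X_k - bY_k)^+ I_C \ge 0,$$

the asserted inequality follows upon dividing by $Z_n$ and integrating.

*Let $A$ with or without affixes denote events defined on the $X_n$'s
and $Y_n$'s.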

B. Basic ergodic theorem. If 


n 


0) 


y I ^ 
k z x Ja, 


0 and 


云 X F — °， w ” 


oo and then ^ i 0, 
(ii)



I ^ n+ ,| 


f 


Y n 

Yn+k 

Y n 

X n 


0 and 


0 ， for every fixed k as ” 一 ^ oo: 


then the sequence $\frac{X^n}{Y^n}$ converges a.s. Assumption (i) implies that, from
some $n$ on, the sum therein exists.

Proof. We have to prove that the invariant event

$$D = \left[\liminf \frac{X^n}{Y^n} \ne \limsup \frac{X^n}{Y^n}\right]$$

is null. Since $D = \bigcup_{a,b} C_{ab}$ is the denumerable union of the invariant
events

$$C_{ab} = \left[\liminf \frac{X^n}{Y^n} < a < b < \limsup \frac{X^n}{Y^n}\right]$$

where $a < b$ vary over the set of all rationals, it suffices to prove that
every event $C_{ab}$ is null.

We set in the basic inequality $Z_n = Y^n$ and

$$C = C_{ab} = B^m C_{ab} + A^m.$$

Because of the invariance of $C_{ab}$, translation by $k - 1$ yields

$$C_{ab} = B_k^m C_{ab} + A_k^m,$$

and the basic inequality becomes

$$\int_{C_{ab}} \left(\frac{X^n}{Y^n} - b\right) - \sum_{k=1}^{n} \int_{A_k^m} \left(\frac{X_k}{Y^n} - b\frac{Y_k}{Y^n}\right) + \sum_{k=n+1}^{n+m} \int_{C_{ab}} \left(\frac{X_k}{Y^n} - b\frac{Y_k}{Y^n}\right)^+ \ge 0.$$

Since by definition of $B^m$ and $C_{ab}$ we have $B^m C_{ab} \uparrow C_{ab}$ and hence
$PA^m \downarrow 0$ as $m \to \infty$, it follows because of (i) and (ii) that, upon letting
$n \to \infty$ and then $m \to \infty$, the foregoing inequality becomes

$$\liminf \int_{C_{ab}} \left(\frac{X^n}{Y^n} - b\right) \ge 0.$$

By changing the $X_k$ into $-X_k$, $b$ into $-a$, $a$ into $-b$, this inequality
becomes

$$\liminf \int_{C_{ab}} \left(a - \frac{X^n}{Y^n}\right) \ge 0.$$

Therefore, by adding up the two last inequalities, we have

$$(a - b)PC_{ab} \ge 0, \qquad a < b,$$

so that $PC_{ab} = 0$ and the proof is concluded.

Let $Z_n$ be positive r.v.'s such that for every invariant event $C$
defined on the $X_n$ and $Y_n$

$$\text{(C)} \qquad \liminf \int_C \frac{Y^n}{Z_n} = 0 \implies PC = 0;$$

this is certainly true if $Z_n = Y^n$. It follows that, if throughout the
preceding proof we divide by $Z_n$ instead of by $Y^n$, this proof remains valid.
In other words,

B'. The basic ergodic theorem remains valid if in the assumptions
therein $Y^n$ are replaced by $Z_n$ obeying condition (C).

33.3. Stationarity. To the concept of invariance correspond weaker
ones of invariance in terms of integrals and in terms of pr.'s.

Let $\{X_n\}$, $\{Y_n\}$ be the family of r.v.'s on which the translations are
defined. Let the events $A$, $B$, $C$, with or without affixes, be defined on
this family, that is, belong to the $\sigma$-field $\mathcal{B}$ induced by the family. Let
$\xi$, with or without affixes, be a r.v. defined on the family. We say that
$\xi$ is integral invariant or that the sequence of translates $\xi_1, \xi_2, \dots$ is
integral stationary if the integrals of the $\xi_k$ exist and if, for every $A$ and
every $k$,

$$\int_{A_k} \xi_k = \int_{A_1} \xi_1.$$

We say that $\xi$ is $P$-invariant or that the sequence $\xi_1, \xi_2, \dots$ is stationary
if, for every $A$ defined on it,

$$PA_k = PA_1.$$

In particular, the family itself is (integral) stationary, if the sequence
$X_1, X_2, \dots$, $Y_1, Y_2, \dots$ is (integral) stationary. It follows from the
definitions that the sequences of translates of all r.v.'s $\xi$ defined on a
stationary family are stationary. Furthermore

a. Stationarity lemma. The sequences of translates of r.v.'s, defined
on a stationary family, and whose expectations exist, are integral stationary.

Proof. We have to prove that, for every $A$ and every $k$,

$$\int_{A_k} \xi_k = \int_{A_1} \xi_1$$

provided the right-hand side exists. As usual, it suffices to prove it
for indicators $\xi_1 = I_{B_1}$. Since set operations are preserved under
translations, it follows, by stationarity, that

$$\int_{A_k} I_{B_k} = P(A_k B_k) = P(A_1 B_1)_k = P(A_1 B_1) = \int_{A_1} I_{B_1},$$

and the lemma follows.


In the integral stationarity case, the basic ergodic inequality takes
very simple forms.

b. Stationarity inequalities. Let the family $\{X_n\}$, $\{Y_n\}$ be integral
stationary and let $X_1$ or $Y_1$ be integrable. Then for every invariant $C$

$$\int_{CC_a} (aY_1 - X_1) \ge 0 \quad \text{and} \quad \int_{CC^b} (X_1 - bY_1) \ge 0$$

where, for $X_1$ and $Y_1$ integrable on $C$, $C_a$ and $C^b$ ($< a$ and $> b$) can be
replaced by $B_a = [\inf X^n/Y^n < a]$ and $B^b = [\sup X^n/Y^n > b]$ ($\le a$ and
$\ge b$).

Proof. On account of integral stationarity and of integrability of
$X_1$ or $Y_1$ hence of all the $X_k$ or $Y_k$, the integrals below exist and

$$\int_{A_k} (X_k - bY_k) = \int_{A_1} (X_1 - bY_1).$$

It follows from the hypotheses that the basic ergodic inequality, with
$Z_n = n$, can be written

$$\int_{B^m C} (X_1 - bY_1) + \frac{m}{n} \int_C (X_1 - bY_1)^+ \ge 0$$

where, as $m \to \infty$,

$$B^m \uparrow B^b = \left[\sup_j \frac{X^j}{Y^j} > b\right].$$


Therefore 




(不一 bY x ) ^ 0 


since, either I (又 i 一 ^Y\)^ = oo and this inequality is trivially true, 


/. 

』 CCb 


ing $C$ by $CC^b$ and letting $n \to \infty$ then $m \to \infty$. By changing $b$ into
$-a$ and $X_1, X_2, \dots$ into $-X_1, -X_2, \dots$, the other asserted inequality
follows; similarly for the remaining assertions.

When the family is integral stationary and $X_1$ and $Y_1$ are integrable,
then 33.2B', with $Z_n = n$, yields $\frac{X^n}{Y^n} \xrightarrow{\text{a.s.}} U$ invariant. However, by
using directly the basic ergodic inequality in its foregoing forms, the
assumptions can be weakened and the result be made more precise, as
follows.

A. Integral stationarity theorem. Let the family $\{X_n\}$, $\{Y_n\}$ be
integral stationary, and let $X_1$ or $Y_1$ be integrable. Then

$$\frac{X_1 + \cdots + X_n}{Y_1 + \cdots + Y_n} \xrightarrow{\text{a.s.}} U \text{ invariant.}$$

If $X_1$ is integrable, then $U$ is a.s. finite.

If $Y_1$ is integrable, then $U = E^{\mathcal{C}}X_1 / E^{\mathcal{C}}Y_1$ a.s.

Proof. Upon replacing $C$ by

$$C_{ab} = \left[\liminf \frac{X^n}{Y^n} < a < b < \limsup \frac{X^n}{Y^n}\right]$$

the stationarity inequalities become

$$\int_{C_{ab}} (aY_1 - X_1) \ge 0, \qquad \int_{C_{ab}} (X_1 - bY_1) \ge 0,$$

so that

$$(a - b) \int_{C_{ab}} Y_1 \ge 0.$$

Since $a < b$ and $Y_1 > 0$ a.s., it follows that $PC_{ab} = 0$. Since

$$D = \left[\liminf \frac{X^n}{Y^n} \ne \limsup \frac{X^n}{Y^n}\right] = \bigcup_{a,b} C_{ab},$$

where $a$ and $b > a$ vary over all rationals, is a countable union of null
sets, $PD = 0$ and the first assertion follows. From now on, we throw
out the null set $D$; this does not modify the values of the integrals
below.

According to the stationarity inequalities, if I X\^ < oo, then, as 


fiu^ 


+/ 1 丄 Xl 


and, since Fi > 0 a.s., P[U = + 00 ] = 0; similarly, if J XC < 00 then 

P[U = — 00]= : 0* The second assertion follows. 

Let Y\ be integrable and set 


C 171 = [(w 一 1) e ^ t/ €>0j 


so that 


+00 


E c m = [| u\ 〈的 ] • 


The stationarity inequalities with a y C replaced by m€ y (m — l)e : 
CC m , respectively, yield 


-L^ Xi 




while, by definition of C m , 


{m — l)e r Y\ ^ f UYi ^ we j Y\. 

Jcc m Jcc m Jcc m 

If U is finite, then, by summing over m = 0, 士 1 ，士 2, • • • and taking 

into account that I X\ exists, and I < oo, we find that 

J n Jr 


/ 不 

J C 


€ Fx ^ UY X S\ Xx + elYx 

Jc Jc Jc 


and, by letting € —> 0, it follows that 



UY X 


i >， 


CG e. 


Since $U$ is $\mathcal{C}$-measurable, we can write

$$U E^{\mathcal{C}}Y_1 = E^{\mathcal{C}}(UY_1) = E^{\mathcal{C}}X_1 \quad \text{a.s.};$$

and, since $Y_1 > 0$ a.s. implies that $E^{\mathcal{C}}Y_1 > 0$ a.s., we have
$U = E^{\mathcal{C}}X_1 / E^{\mathcal{C}}Y_1$ a.s.

If $U$ is not finite, the above equality of integrals continues to hold
provided $C$ is replaced by $C[|U| < \infty]$, so that $U = E^{\mathcal{C}}X_1 / E^{\mathcal{C}}Y_1$ on
$[|U| < \infty]$ outside a null subset. But, by the stationarity inequalities
with $C$ replaced by $C[U = +\infty]$, we have

f 

JC[U- 


+ 00】 







C[C ； --foo] 


so that, by letting ^ ^ oo, if PC[U = +oo] > 0, then 



e 

E X x 


(7[C/»+oo 】 


and hence E e X\ = +oo on [U 
E e Yi < oo a.s., it follows that 


+oo] outside a null subset. Since 


$$U = +\infty = E^{\mathcal{C}}X_1 / E^{\mathcal{C}}Y_1 \quad \text{on } [U = +\infty]$$

outside a null subset; similarly, for $[U = -\infty]$. Thus, $U = E^{\mathcal{C}}X_1 / E^{\mathcal{C}}Y_1$
a.s. whether $U$ is finite or not, and the last assertion is proved.

From now on, we consider only families consisting of one sequence
$\{X_n\}$ so that the events and translations are defined in terms of the
$\{X_n\}$. We use the fact that if $EX_1$ exists, then, by the stationarity
lemma, stationarity of $\{X_n\}$ is equivalent to integral stationarity of
the family $\{X_1, X_2, \dots\}$ and $\{1, 1, \dots\}$.

B. Stationarity theorem. Let the family $\{X_n\}$ be stationary.

If $EX_1$ exists, then

$$\frac{X_1 + \cdots + X_n}{n} \xrightarrow{\text{a.s.}} E^{\mathcal{C}}X_1.$$

If $E|X_1|^r < \infty$ for an $r \ge 1$, then

$$\frac{X_1 + \cdots + X_n}{n} \xrightarrow{r} E^{\mathcal{C}}X_1.$$

Proof. The first assertion follows from the integral stationarity
theorem with $Y_n = 1$. The second assertion will follow from the first one
on account of the $L_r$-convergence theorem, if we prove that the $r$th
powers of $\left|\frac{1}{n}(X_1 + \cdots + X_n)\right|$ are uniformly integrable. But,
by the stationarity lemma and integrability of $|X_1|^r$, for every $k$,

$$\int_{[|X_k| \ge c]} |X_k|^r = \int_{[|X_1| \ge c]} |X_1|^r < \epsilon,$$

where $\epsilon > 0$ is arbitrarily small and $c = c_\epsilon$ sufficiently large.
Therefore, for $PA$ sufficiently small and for every $k$,

$$\int_A |X_k|^r = \int_{A[|X_k| \ge c]} |X_k|^r + \int_{A[|X_k| < c]} |X_k|^r \le \epsilon + c^r PA < 2\epsilon$$

and hence, by Minkowski's inequality, $\int_A \left|\frac{1}{n}(X_1 + \cdots + X_n)\right|^r < 2\epsilon$ whatever be $n$.
By the same lemma and inequality $E\left|\frac{1}{n}(X_1 + \cdots + X_n)\right|^r \le E|X_1|^r$. The assertion
is proved.
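
The role of the invariant $\sigma$-field in the limit can be seen in a simulation. The sketch below is an illustration under stated assumptions, not an example from the text: the sequence is $X_k = Z + N_k$, where $Z = \pm 1$ is drawn once per sample path and the $N_k$ are independent standard normal r.v.'s. The sequence is stationary, $Z$ is invariant, and the averages should approach $Z$ (playing the role of $E^{\mathcal{C}}X_1$) rather than the constant $EX_1 = 0$. The function name and the seed are hypothetical.

```python
import random

def sample_path(n, rng):
    """Stationary but decomposable sequence: X_k = Z + noise, with Z drawn once per path."""
    z = rng.choice([-1.0, 1.0])                       # invariant r.v., fixed along the path
    xs = [z + rng.gauss(0.0, 1.0) for _ in range(n)]  # X_1, ..., X_n
    return z, xs

rng = random.Random(12345)
results = []
for _ in range(20):
    z, xs = sample_path(20000, rng)
    results.append((z, sum(xs) / len(xs)))   # (value of Z, average of the path)
```

Each path's average settles near its own value of $Z$, so the limit is a nondegenerate invariant r.v.; it reduces to the constant $EX_1$ only when the invariant events have pr. 0 or 1.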

Corollary. If $\{X_n\}$ is stationary, then

H - h X n 丄 _ 


E e X x 


e 

E X x 


on the set on which the right-hand side generalized c.exp of X\ exists^ out¬ 
side a null subset. 

Apply the stationarity theorem to $\frac{1}{n}(X_1^+ + \cdots + X_n^+)$ and to
$\frac{1}{n}(X_1^- + \cdots + X_n^-)$.

Remark 1. In the case of integrable $X_1$, the equality $X = \text{a.s.} \lim \frac{1}{n}(X_1 + \cdots + X_n) = E^{\mathcal{C}}X_1$
a.s. results also from convergence in the first mean,
implied, according to the above theorem, by $E|X_1| < \infty$. For then
we can pass to the limit under the integration sign, so that

$$\int_C X_1 = \int_C \frac{1}{n}(X_1 + \cdots + X_n) \to \int_C X, \qquad C \in \mathcal{C}.$$

Remark 2. It is useful to observe that convergence in the rth mean
follows from (1) bounded sequences which converge a.s. (or even only
in pr.) converge in the rth mean, and (2) bounded functions are dense
in L r . The first property is immediate. The second property is ex¬ 
ploited by setting 

不 =+ ”1， = = 丨之 c |， 

so that E\ X\\ r < implies that as 历， w — then r — oo ， 

ll^-^n^llr-rll + ll^ll + IRII 


^ Hr-rll+ 211^11 
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33.4. Applications; ergodic hypothesis and independence. We say 
that a sequence X n of r.v.’s is indecomposable if all invariant functions 
defined on it degenerate into constants or, equivalently, if its invariant 
cr-field consists of 0 and only, up to an equivalence. We say that the 

• . • 1 n as 

ergodic hypothesis is true for the sequence X n if — X) ^ — > what- 

ever be the r.v . 专 G 乙 defined on the sequence. Since for an indecom¬ 
posable sequence a.s., we have, on account of the station 枉 rity 

theorem, 

If the sequence X n is stationary^ then the ergodic hypothesis is true if 、 
and only if y the sequence X n is indecomposable. 

Let the r.v.’s X n be independent. Then the sequence X n is station¬ 
ary if, and only if, the X n are identically distributed. On the other 
hand, by the zero-one law, its tail $\sigma$-field reduces to $\emptyset$ and $\Omega$ up to an
equivalence and, the invariant events being tail-events, the sequence is
indecomposable. Thus

The ergodic hypothesis is true for sequences of independent and identically
distributed r.v.'s.


In particular, if $E|X_1| < \infty$, then $\frac{1}{n}\sum_{k=1}^{n} X_k \xrightarrow{\text{a.s.}} EX_1$ finite (and,
moreover, converges in the first mean). Conversely, if $\frac{1}{n}\sum_{k=1}^{n} X_k \xrightarrow{\text{a.s.}} c$
finite, then $X_n/n \xrightarrow{\text{a.s.}} 0$ and, by Borel's zero-one law,

$$E|X_1| \le 1 + \sum P[|X_n| \ge n] < \infty.$$

Thus, we have Kolmogorov's strong law of large numbers. In fact,
it can be made more precise, as follows:

Let the r.v.'s $X_n$ be independent and identically distributed. If $EX_1$
exists, then $\frac{1}{n}\sum_{k=1}^{n} X_k \xrightarrow{\text{a.s.}} EX_1$. If $\frac{1}{n}\sum_{k=1}^{n} X_k \xrightarrow{\text{a.s.}} \bar X$, then $\bar X$ necessarily
degenerates at some constant $c$; $c$ finite implies that $EX_1$ exists and is finite,
while $c = +\infty$ $(-\infty)$ implies that $EX_1^+$ $(EX_1^-) = +\infty$.

First observe that degeneracy follows by the zero-one law.

The convergence assertion follows from the stationarity theorem.

If $\frac{1}{n}\sum_{k=1}^{n} X_k \xrightarrow{\text{a.s.}} c$ finite, then, according to the converse above,
$EX_1$ exists and is finite. If $\frac{1}{n}\sum_{k=1}^{n} X_k \xrightarrow{\text{a.s.}} c = +\infty$, then

$$EX_1^+ \xleftarrow{\text{a.s.}} \frac{1}{n}\sum_{k=1}^{n} X_k^+ = \frac{1}{n}\sum_{k=1}^{n} X_k + \frac{1}{n}\sum_{k=1}^{n} X_k^- \xrightarrow{\text{a.s.}} +\infty + EX_1^-$$

and, hence, $EX_1^+ = +\infty$. Similarly, if $c = -\infty$, then $EX_1^- = +\infty$.
The proposition is proved.
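
As a numerical illustration of the strong law just stated, the following Python sketch (not part of the original text) averages simulated independent, identically distributed variables. The names `running_average` and `uniform01`, the choice of the uniform law on (0, 1), for which $EX_1 = 1/2$, and the fixed seed are assumptions made only for this example; a simulation can suggest the a.s. convergence but of course proves nothing.

```python
import random


def running_average(draw, n, seed=0):
    """Return (X_1 + ... + X_n) / n for i.i.d. draws produced by draw(rng)."""
    rng = random.Random(seed)  # fixed seed so that the run is reproducible
    total = 0.0
    for _ in range(n):
        total += draw(rng)
    return total / n


def uniform01(rng):
    """One draw from the uniform law on (0, 1); its expectation is 1/2."""
    return rng.random()


if __name__ == "__main__":
    # The averages should settle near EX_1 = 1/2 as n grows.
    for n in (10, 1_000, 100_000):
        print(n, running_average(uniform01, n))
```

Replacing `uniform01` by a sampler with $E|X_1| = \infty$ is a simple way to observe the failure of convergence to a finite constant asserted by the converse.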

Because the ergodic hypothesis cannot be true for decomposable
stationary sequences, the brutal answer to the ergodic problem is in
the negative. However, its wreck can be salvaged in various ways.
One way is to assert that it is true "in general" with a suitable
definition of this term; the best is a category definition. Another
approach can be stated as follows: Observe that when the sequence $X_n$
is stationary, then, for all events $B$ defined on it, $\frac{1}{n}\sum_{k=1}^{n} I_{B_k} \xrightarrow{\text{a.s.}} P^{\mathcal C} B$.
If the decomposition theorem 26.2A applies, then

$$P^{\mathcal C} = \sum_{t \in T} P_{B_t} I_{B_t}, \quad P_{B_t} B_t = 1,$$

provided a null event $N$ is thrown out of $\Omega$. Then the ergodic hypothesis
is true, provided $P$ is replaced by any of the $P_{B_t}$ into which $P$ is
decomposed, or $\Omega$ is replaced by any of the $B_t$. This is the ergodic
decomposition.

*33.5. Applications; stationary chains. Let $X_n$ be a sequence of
r.v.'s (or more generally random vectors) with same range space $R$ and
let $P_n$ on the Borel field $\mathcal B$ in $R$ be the distribution of $X_n$. We recall
that when the sequence $X_n$ is a constant chain its law is described by
means of the initial distribution $P_1$ of $X_1$ and the (one-step) transition
probability (tr.pr.)

$$P(x, S) = P(X_{m+1} \in S \mid X_m = x), \quad x \in R, \ S \in \mathcal B, \ m = 1, 2, \cdots.$$

We can and do select the tr.pr. to be regular; in other words, $P(x, S)$ is
a pr. in $S$ for every fixed $x$ and is a Borel function in $x$ for every fixed
$S$. Then the same is true of its iterates

$$P^n(x, S) = P[X_{m+n} \in S \mid X_m = x]$$

and

$$P^{m+n}(x, S) = \int P^m(x, dy) P^n(y, S),$$

$$P_{n+1} S = \int P_1(dx) P^n(x, S).$$

The chain is stationary if, and only if, the initial distribution is invariant
under the tr.pr., that is, under translations: $P_n = P$, $n = 1, 2, \cdots$.

From now on, we assume that the $X_n$ form a stationary chain ($P_n = P$,
$n = 1, 2, \cdots$) with regular tr.pr. $P(x, S)$ and invariant initial
distribution $P$.
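
For a chain with finitely many states, the invariance condition $P_n = P$ can be checked by direct computation. The sketch below is only an illustration under assumed data: the two-state tr.pr. `P`, the vectors `invariant` and `not_invariant`, and the helper names are hypothetical and do not come from the text. Exact fractions are used so that equality tests are meaningful.

```python
from fractions import Fraction as F


def push_forward(dist, P):
    """Distribution of X_{n+1} when X_n has distribution dist:
    the sum over x of dist[x] * P[x][y]."""
    size = len(P)
    return [sum(dist[x] * P[x][y] for x in range(size)) for y in range(size)]


def marginals(initial, P, n):
    """Return the distributions P_1, ..., P_n of X_1, ..., X_n."""
    out, dist = [], list(initial)
    for _ in range(n):
        out.append(dist)
        dist = push_forward(dist, P)
    return out


# Hypothetical two-state tr.pr.; row x is the pr. P(x, .).
P = [[F(9, 10), F(1, 10)],
     [F(1, 2), F(1, 2)]]

invariant = [F(5, 6), F(1, 6)]      # solves invariant * P = invariant
not_invariant = [F(1, 2), F(1, 2)]  # moves after one step

if __name__ == "__main__":
    print(marginals(invariant, P, 3))
    print(marginals(not_invariant, P, 3))
```

Starting from `invariant`, every marginal distribution coincides with the initial one, which is the stationarity condition above; starting from `not_invariant`, the marginals change with $n$.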

Since the stationary chain is described in terms of the common
distribution $P$ of single r.v.'s $X_n$ and the common conditional pr.'s $P^n(x, S)$
of events defined on single r.v.'s $X_{m+n}$ given $X_m$, the limit properties
of the chain are essentially related to those of single r.v.'s. All the more
so since the invariant events defined on the chain are a.s. defined on
any single r.v., say $X_1$. For, every invariant event $C$ is defined on $X_n$,
$X_{n+1}, \cdots$ whatever be $n$ and, because of stationarity, chain dependence,
and martingale convergence theorem,

$$P^{X_1} C = P^{X_n} C = P^{X_1, \cdots, X_n} C \to I_C \ \text{a.s.}$$

Thus, the inverse image of the sub $\sigma$-field $\mathcal C$ of Borel sets $S$ such that
$P(x, S) = I_S(x) = 1$ or $0$ according as $x \in S$ or $x \notin S$ is equivalent to
the $\sigma$-field of invariant events defined on the stationary chain $X_n$; by
abuse of language, we shall call $\mathcal C$ the $\sigma$-field of invariant Borel sets.
We are ready to investigate the asymptotic behavior of the tr.pr.'s of
our stationary chain and follow Doob.

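A. Invariant tr.pr. theorem. There exists a tr.pr. $\bar P(x, S)$ such that

(i) For every $S$ and every $x \notin N_S$ with $PN_S = 0$

$$\frac{1}{n}\sum_{k=1}^{n} P^k(x, S) \to \bar P(x, S)$$

and

$$\bar P(x, S) = \int \bar P(x, dy) P(y, S) = \int P(x, dy) \bar P(y, S) = \int \bar P(x, dy) \bar P(y, S).$$

(ii) For every $S$ and every invariant $S'$

$$PSS' = \int_{S'} P(dx) \bar P(x, S).$$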

The equalities in (i) say that, except for $x \in N_S$, the tr.pr.'s $P(x, S)$ and
$\bar P(x, S)$ are invariant under one another and that $\bar P(x, S)$ is idempotent.
Property (ii) says that $\bar P(x, S)$ is a Radon-Nikodym derivative of $P$
given $\mathcal C$.

Proof. Since the sequence $I_{[X_n \in S]}$ of translations of the indicator
$I_{[X_1 \in S]}$ is defined on the stationary sequence $X_n$, it follows that

$$\frac{1}{n}\sum_{k=1}^{n} I_{[X_k \in S]} \xrightarrow{\text{a.s.}} \bar I(S) \ \text{invariant}.$$

Therefore, upon taking the conditional exp.'s given $X_1$ and applying
the conditional dominated convergence theorem, we have

$$\frac{1}{n}\sum_{k=1}^{n} P^{X_1}[X_k \in S] \xrightarrow{\text{a.s.}} E^{X_1} \bar I(S).$$

Thus, denoting the limit by $\bar P(x, S)$, we have

$$\frac{1}{n}\sum_{k=1}^{n} P^k(x, S) \to \bar P(x, S)$$

except for $x \in N_S$ such that $P_1 N_S = P[X_1 \in N_S] = 0$.

On account of stationarity, if $S'$ is an invariant Borel set, then

$$\int_{S'} P(dx) \Big\{ \frac{1}{n}\sum_{k=1}^{n} P^k(x, S) \Big\} = \frac{1}{n}\sum_{k=1}^{n} P[X_k \in S, X_1 \in S'] = \frac{1}{n}\sum_{k=1}^{n} P[X_k \in S, X_k \in S'] = P[X_1 \in SS'] = PSS'.$$

Therefore, upon letting $n \to \infty$ and using the dominated convergence
theorem, we obtain

$$\int_{S'} P(dx) \bar P(x, S) = PSS'.$$

Since $\bar P(x, S)$ is a conditional pr. of the distribution $P$ given the sub
$\sigma$-field $\mathcal C$ of Borel sets in $R$, we can and do regularize it. Then $\bar P(x, S)$
is $\mathcal C$-measurable in $x$ for every fixed $S$ and the equalities in (i) follow
from the fact that the indefinite integrals on $\mathcal C$ of their terms coincide.
This concludes the proof.
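
For a finite state space the averaged tr.pr. can be computed exactly, which makes the content of the theorem easy to inspect. The sketch below is an illustration only: the two-state periodic chain and the helper names `mat_mul` and `cesaro_average` are assumptions of the example, not objects from the text. The iterates $P^k$ of this chain alternate and do not converge, while their averages do, and the limit is invariant under $P$ and idempotent.

```python
from fractions import Fraction as F


def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def cesaro_average(P, n):
    """Return (1/n) * (P^1 + ... + P^n), computed exactly with fractions."""
    size = len(P)
    total = [[F(0)] * size for _ in range(size)]
    power = [[F(int(i == j)) for j in range(size)] for i in range(size)]  # P^0
    for _ in range(n):
        power = mat_mul(power, P)
        total = [[total[i][j] + power[i][j] for j in range(size)]
                 for i in range(size)]
    return [[entry / n for entry in row] for row in total]


# Hypothetical periodic chain: P^k is the swap for odd k, the identity for even k.
P = [[F(0), F(1)],
     [F(1), F(0)]]

# For this chain the average over an even number of steps already equals the limit.
P_bar = cesaro_average(P, 1000)

if __name__ == "__main__":
    print(P_bar)                              # every entry equals 1/2
    print(mat_mul(P, P_bar) == P_bar)         # invariance under P
    print(mat_mul(P_bar, P_bar) == P_bar)     # idempotency
```

In this finite example there are no exceptional starting points; the null sets $N_S$ of the theorem only matter for general state spaces.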

The exceptional $P$-null sets $N_S$ of starting points $x$ vary in general
with the entrance sets $S$. The question arises how to recognize points
which do not belong to the exceptional sets and to find conditions under
which these sets do not vary with the entrance sets. We denote by
$P_s^n(x, S)$ the $P$-singular part of $P^n(x, S)$ and by $S_n$ a $P$-null set such
that $P^n(x, S_n) = P_s^n(x, R)$; $S_n$ depends upon $n$ and $x$.






a. Singular tr.pr. lemma. For every fixed $x \in R$, the sequence
$P_s^n(x, R)$ is nonincreasing and hence converges.

For, if $m < n$ and $S_0$ is the $P$-null set of points $y$ for which $P(X_m = y;
X_n \in S_n) > 0$, then

$$P_s^n(x, R) = P^n(x, S_n) = \int P^m(x, dy) P(X_m = y; X_n \in S_n) \le P^m(x, S_0) \le P_s^m(x, R).$$

We set $P_s(x, R) = \lim P_s^n(x, R) = \lim P^n(x, S_n)$ and call it singular
tr.pr.

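B. Vanishing singular tr.pr. theorem. For every $x$ such that
$P_s(x, R) = 0$ and for every $S$,

$$\frac{1}{n}\sum_{k=1}^{n} P^k(x, S) \to \bar P(x, S)$$

where $\bar P(x, S)$ is $P$-continuous and

$$\bar P(x, S) = \int \bar P(x, dy) P(y, S).$$

If $P_s(x, R) = 0$ except for $x \in N_s$ with $PN_s = 0$, then moreover for every
$x \notin N_s$ and every $S$, $\bar P(x, S)$ is idempotent:

$$\bar P(x, S) = \int \bar P(x, dy) \bar P(y, S).$$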

Proof. Fix $x \in [P_s(x, R) = 0]$. For $m < n$ and arbitrary $S$, we
have

$$\frac{1}{n}\sum_{k=1}^{n} P^k(x, S) = \frac{1}{n}\sum_{k=1}^{m} P^k(x, S) + \int P^m(x, dy) \Big\{ \frac{1}{n}\sum_{k=1}^{n-m} P^k(y, S) \Big\}.$$

According to theorem A, as $n \to \infty$ the integrand converges to $\bar P(y, S)$
except on a $P$-null set $N_S$.

Since the integration measure is $P$-continuous for events in $S_m^c$, we
can apply the dominated convergence theorem for the integral taken
over $S_m^c$. As for the integral taken over $S_m$, it is bounded by $P^m(x, S_m)
= P_s^m(x, R) \to 0$ as $m \to \infty$. Therefore, by letting $n \to \infty$ and then
$m \to \infty$, the limit assertion follows. If $S_0$ is a $P$-null set, then

$$\bar P(x, S_0) = \lim \frac{1}{n}\sum_{k=1}^{n} P^k(x, S_0) \le \lim \frac{1}{n}\sum_{k=1}^{n} P_s^k(x, R) = 0$$

so that $\bar P(x, S)$ is $P$-continuous. Moreover, by letting $n \to \infty$ in

$$\frac{1}{n}\sum_{k=m+1}^{m+n} P^k(x, S) = \int \Big\{ \frac{1}{n}\sum_{k=1}^{n} P^k(x, dy) \Big\} P^m(y, S),$$

we obtain

$$\bar P(x, S) = \int \bar P(x, dy) P^m(y, S);$$

hence

$$\bar P(x, S) = \int \bar P(x, dy) \Big[ \frac{1}{n}\sum_{k=1}^{n} P^k(y, S) \Big].$$

Now, assume that the set of points $x$ such that what precedes does not
hold is a $P$-null set. Because of the $P$-continuity of $\bar P(x, S)$, this
exceptional set is also null in the integration measure and we can apply
the dominated convergence theorem. This yields idempotency of
$\bar P(x, S)$ and concludes the proof.

Corollary. If $P_s(x, R) = 0$ for every $x \in R$, then

$$\bar P(x, S) = \int P^m(x, dy) \bar P(y, S).$$

This follows from the limit assertion in the foregoing theorem upon
letting $n \to \infty$ in

$$\frac{1}{n}\sum_{k=1+m}^{n+m} P^k(x, S) = \int P^m(x, dy) \Big\{ \frac{1}{n}\sum_{k=1}^{n} P^k(y, S) \Big\}.$$

C. Decomposition theorem. There exists a partition

$$R = \sum_{t \in T} S_t + N, \quad T \subset R, \quad PN = 0$$

such that $\bar P(x, S_t) = 1$ for every $x \in S_t$, and pr.'s $P_t$ such that

$$\bar P(x, S) = \sum_{t \in T} I_{S_t}(x) P_t S, \quad x \notin N, \quad P_t S_t = 1$$

and the $P_t$ are invariant:

$$P_t S = \int P_t(dx) P(x, S).$$

In fact, every pr. of the form

$$P'S = \int P_t S \, \mu(dt)$$

where $\mu$ is a pr. on a $\sigma$-field in $T$ is invariant, and if an invariant pr. is
$P$-continuous, then the converse is true.

Proof. Since the Borel field $\mathcal B$ is generated by a denumerable field
$\{S(n)\}$ of Borel sets (say, the field of finite sums of intervals with
rational extremities) and the conditional pr. $\bar P(x, S)$ given the $\sigma$-field of
invariant Borel sets is regularized, the decomposition theorem 26.2A
applies. This yields the asserted decomposition with invariant atoms $S_t$,
and the asserted properties of $\bar P(x, S_t)$ and $\bar P(x, S)$. As for the
invariance of the $P_t$, since, by theorem A, for every fixed $S$,

$$\bar P(x, S) = \int \bar P(x, dy) P(y, S), \quad x \notin N_S, \quad PN_S = 0,$$

it follows that

$$P_t S = \int P_t(dx) P(x, S)$$

except for indices $t \in T_S \subset T$ corresponding to sets $S_t$ of total $P$-pr.
zero. We add such sets for $S = S(1), S(2), \cdots$ to the $P$-null set of
the partition so that the last equality holds for all remaining $t$'s and
every $S(n)$. Since the $S(n)$ form a field, it follows as usual that the
equality holds for all Borel sets $S$. This proves the invariance of the
pr.'s $P_t$, and by integrating the invariance relation with respect to the
pr. $\mu$ in $t$ we find that the pr. $P'$ is invariant:

$$P'S = \int P'(dx) P(x, S).$$

Conversely, if $P'$ is an invariant pr., then for every $n$

$$P'S = \int P'(dx) \Big\{ \frac{1}{n}\sum_{k=1}^{n} P^k(x, S) \Big\}.$$

By theorem A, the integrand converges to $\bar P(x, S)$ except for $x$
belonging to a $P$-null set. Thus, if $P'$ is $P$-continuous, then the exceptional
set is $P'$-null, the dominated convergence theorem applies and

$$P'S = \int P'(dx) \bar P(x, S).$$

Upon using the decomposition of $\bar P(x, S)$, the converse assertion is
proved, and the proof is concluded.

Corollary. Every component chain $\{P_t, P(x, S)\}$ is stationary and
indecomposable.
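
The decomposition can be made concrete on a finite chain with two closed classes. The following sketch is a hedged illustration: the three-state tr.pr., the classes $S_a = \{0, 1\}$ and $S_b = \{2\}$, and all helper names are assumptions of the example. The rows of the averaged tr.pr. give the component pr.'s $P_t$, and every mixture of them is invariant.

```python
from fractions import Fraction as F


def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def cesaro_average(P, n):
    """Return (1/n) * (P^1 + ... + P^n), computed exactly with fractions."""
    size = len(P)
    total = [[F(0)] * size for _ in range(size)]
    power = [[F(int(i == j)) for j in range(size)] for i in range(size)]
    for _ in range(n):
        power = mat_mul(power, P)
        total = [[total[i][j] + power[i][j] for j in range(size)]
                 for i in range(size)]
    return [[entry / n for entry in row] for row in total]


def is_invariant(dist, P):
    """True when dist * P == dist, that is, dist is invariant under the tr.pr."""
    size = len(P)
    moved = [sum(dist[x] * P[x][y] for x in range(size)) for y in range(size)]
    return moved == list(dist)


# Hypothetical decomposable chain: states 0 and 1 swap, state 2 is absorbing.
P = [[F(0), F(1), F(0)],
     [F(1), F(0), F(0)],
     [F(0), F(0), F(1)]]

P_bar = cesaro_average(P, 1000)
P_a = P_bar[0]   # component pr. carried by S_a = {0, 1}
P_b = P_bar[2]   # component pr. carried by S_b = {2}


def mixture(weight_a):
    """The pr. weight_a * P_a + (1 - weight_a) * P_b."""
    return [weight_a * a + (1 - weight_a) * b for a, b in zip(P_a, P_b)]


if __name__ == "__main__":
    print(P_a, P_b)
    print(is_invariant(mixture(F(1, 3)), P))
```

A starting distribution concentrated on a single state of $S_a$ is not invariant, which is consistent with the fact that the invariant pr.'s here are exactly the mixtures of `P_a` and `P_b`.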

*§34. ERGODIC THEOREMS AND $L_r$-SPACES

Let $(\Omega, \mathcal A, P)$ be our pr. space. In the usual ergodic theory the
primary datum is a one-to-one transformation $T_1$ on $\Omega$ to $\Omega$. In fact, the
basic underlying concept is that of the inverse transformation $T_1^{-1}$
operating on sets and not on points: $\omega \in T_1^{-1}A \Leftrightarrow T_1\omega \in A$.

$T_1^{-1}$ preserves all sets operations and is said to be measurable if it
transforms measurable sets into measurable sets. But this is precisely
what the translations along, say, a sequence $X_1, X_2, \cdots$ of r.v.'s do;
they transform events defined on the sequence into events defined on
the same sequence, that is, they transform the sub $\sigma$-field $\mathcal B$ of events
induced by the sequence into itself. Once the translates of these events,
hence of their indicators, are determined, they determine the translates
of simple and then measurable functions defined on the sequence and,
in particular, determine the sequence itself, given its first term. Thus
the primary datum becomes that of translations of events; it is more
general than that of point transformations. In the sequel, the $\sigma$-field
of events to be translated is the whole $\sigma$-field $\mathcal A$ of events, but it may as
well be any fixed sub $\sigma$-field $\mathcal B$, whether induced by a random function
or not.

34.1. Translations and their extensions. We say that a single-valued
transformation $T$ on the $\sigma$-field $\mathcal A$ of events into itself is a
translation (by 1) if it preserves (commutes with) all countable operations
on events and preserves $\Omega$ and $\emptyset$: for any $A, A_j \in \mathcal A$

$$T(A^c) = (TA)^c, \quad T \bigcap A_j = \bigcap TA_j, \quad T \bigcup A_j = \bigcup TA_j, \quad T\Omega = \Omega, \quad T\emptyset = \emptyset.$$

In fact, it suffices that $T$ preserve complementations and countable
intersections (unions); for preservation of countable unions (intersections)
follows by the de Morgan rules, while that of $\Omega$, hence of $\emptyset$, and
consequently of disjunctions and countable sums, follows by

$$T\Omega = T(A \cup A^c) = TA \cup TA^c = TA \cup (TA)^c = \Omega.$$

Thus, $T$ translates the $\sigma$-field $\mathcal A$ into a $\sigma$-field $T\mathcal A$, then $T\mathcal A$ into $T(T\mathcal A)
= T^2\mathcal A$, and so on. The translation $T^k$ (by $k = 1, 2, \cdots$) is the $k$th
iterate of $T$ and $T^0 = I$ is the identity transformation: $IA = A$, $A \in \mathcal A$.
Let $\xi$ on $\Omega$ be a measurable function. The translate by $k - 1$ of $\xi$ is
the measurable function $\xi_k$ (so that $\xi_1 = \xi$) determined by

$$(1) \qquad [\xi_k \in S] = T^{k-1}[\xi \in S]$$

for all Borel sets $S \subset R$; for, then, $\xi_k$ assigns to any $\omega \in \Omega$ a value
$x \in R$ by the correspondence

$$\xi_k(\omega) = x \Leftrightarrow \omega \in T^{k-1}[\xi = x];$$

in other words, any atom $[\xi_k = x]$ of the sub $\sigma$-field of events induced
by $\xi_k$ is the translate by $k - 1$ of the atom $[\xi = x]$ of the sub $\sigma$-field of
events induced by $\xi$. Conversely, letting $r$ vary over the rationals,

T k ~ l [!i = y] = H P 一 1 [j < ^ < r] implies that 

r >y >8 

<^]= n n <r] = n < x]. 

r >s 

Let $\mathfrak M$ be the family of measurable functions on $(\Omega, \mathcal A)$, to be denoted
by $\xi$ with or without affixes. Let $I_{\mathcal A}$ be the subfamily of indicators of
events. Translation $T$ on $\mathcal A$ can also be considered as defined on $I_{\mathcal A}$
(to $I_{\mathcal A}$), and relation (1) extends it to a transformation $T$ on $\mathfrak M$ (to $\mathfrak M$)
which we continue to call a translation. We intend to show that this
extension is linear: $T(a\xi + a'\xi') = aT\xi + a'T\xi'$, and continuous:
$T(\lim \xi^{(n)}) = \lim T\xi^{(n)}$. More precisely

A. Extension theorem. Relation (1) extends the translation $T$ on
$I_{\mathcal A}$ to a linear and continuous transformation $T$ on $\mathfrak M$, with $T1 = 1$ and
$TI_{AB} = TI_A \cdot TI_B$. Conversely, the restriction of such a transformation to
$I_{\mathcal A}$ is a translation on $I_{\mathcal A}$.

Proof. The converse assertion is immediate. As for the direct
assertion, it is obvious that $T1 = 1$; linearity follows from the relations
below where $a > 0$ and $r$ varies over the set of all rationals:

$$[T(-\xi) < x] = T[-\xi < x] = T[\xi > -x] = [T\xi > -x] = [-T\xi < x],$$

$$[T(a\xi) < x] = T[a\xi < x] = T[\xi < x/a] = [T\xi < x/a] = [aT\xi < x],$$

$$[T(\xi + \xi') < x] = T \bigcup_r [\xi < r][\xi' < x - r] = \bigcup_r [T\xi < r][T\xi' < x - r] = [T\xi + T\xi' < x];$$

continuity follows from the relations

$$[T \sup \xi^{(n)} > x] = T \bigcup [\xi^{(n)} > x] = \bigcup [T\xi^{(n)} > x] = [\sup T\xi^{(n)} > x]$$

which mean that $T$ commutes with $\sup_n$, hence with $\inf_n \xi^{(n)} = -\sup_n (-\xi^{(n)})$
and, consequently, with $\limsup_n = \inf \sup_n$ and $\liminf_n = \sup \inf_n$.

Corollary 1. Given a translation $T$ on $I_{\mathcal A}$, the translate (by 1) of a
simple function $\sum_{k=1}^{n} x_k I_{A_k}$ is $\sum_{k=1}^{n} x_k I_{TA_k}$, and the translate of any $\xi$ is the
limit of the translates of any sequence of simple functions which converges
to $\xi$.

Corollary 2. A translation $T$ on $I_{\mathcal A}$ has a unique extension to a
translation $T$ on $\mathfrak M$: a linear continuous transformation with $T1 = 1$ and
$TI_{AB} = TI_A \cdot TI_B$.

This follows from Corollary 1.
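
To see the extension at work, one can take the special case where the translation is induced by a point map on a finite space: $TA$ is the inverse image of $A$, and the translate of a function is its composition with the map. The sketch below is only such a special case, written under assumptions: the four-point space, the map `tau`, the functions `xi` and `eta`, and the helper names are all hypothetical. It checks relation (1), linearity, and preservation of complements.

```python
def translate_event(event, tau, omega):
    """TA: the inverse image of the event A under the point map tau."""
    return frozenset(w for w in omega if tau[w] in event)


def translate_function(xi, tau, omega):
    """(T xi)(w) = xi(tau(w)), so that [T xi in S] = T[xi in S]."""
    return {w: xi[tau[w]] for w in omega}


def level_set(xi, values, omega):
    """The event [xi in values]."""
    return frozenset(w for w in omega if xi[w] in values)


omega = range(4)
tau = {0: 1, 1: 2, 2: 3, 3: 0}      # hypothetical point transformation
xi = {0: 5, 1: 5, 2: 7, 3: 9}       # a simple function on omega
eta = {0: 1, 1: 0, 2: 0, 3: 2}      # another simple function

if __name__ == "__main__":
    t_xi = translate_function(xi, tau, omega)
    # Relation (1): the event [T xi = 5] is the translate of the event [xi = 5].
    print(level_set(t_xi, {5}, omega) == translate_event(level_set(xi, {5}, omega), tau, omega))
```

Translations of events need not come from a point map, as the text stresses; the example only covers the induced case, where every property of the extension theorem can be verified by enumeration.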

34.2. A.s. ergodic theorem. From now on, $T$ denotes a fixed translation,
and we set

$$\bar T^n = \frac{1}{n}\sum_{k=1}^{n} T^{k-1}$$





so that _ 

^ 

We denote by $\mathcal C$ the sub $\sigma$-field of events invariant under $T$, to be
called invariant events: $TC = C$, $C \in \mathcal C$, set

$$\bar P^n A = E\bar T^n I_A = \frac{1}{n}\sum_{k=1}^{n} PT^{k-1}A, \quad A \in \mathcal A,$$

and observe that $\bar P^n$ so defined on $\mathcal A$ is a pr. coinciding with $P$ on $\mathcal C$.

According to the definitions, the translates of events defined on the
sequence of translates $X_n = T^{n-1}X$ of a r.v. $X$ coincide with the
translates along this sequence as defined in the preceding section. Thus, all
propositions therein apply to such sequences. Yet, the primary datum
being now the translation and not the sequence, the outlook changes
and new problems arise:

Find conditions to be imposed upon $T$ under which the sequences $\bar T^n X$
converge in some sense for (as large as possible) families of r.v.'s.
Furthermore, find families for which these conditions on $T$ are not only
sufficient but also necessary for various types of convergence.

We begin by observing that, upon setting $Y_n = 1$ in the basic ergodic
theorem, it yields

a. Ergodic lemma. Let

$$\limsup \bar P^n A \to 0 \ \text{as} \ A \downarrow \emptyset.$$

Then $\bar T^n X \xrightarrow{\text{a.s.}} \bar T X$ invariant, for every r.v. $X$ such that

丄 ^ o and f T k ^ l X ^ 0, 

nJ niZi y 

as $n \to \infty$ and then $A \downarrow \emptyset$.

In particular, $\bar T^n X \xrightarrow{\text{a.s.}} \bar T X$ bounded, for every bounded r.v. $X$.

The particular case follows from $|X| \le c < \infty$ and hence $|T^k X| \le c$,
so that the foregoing integral is bounded by $c/n$ while the sum of
integrals is bounded by $c\bar P^n A$.

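b. Invariant pr. lemma. The three properties below are equivalent:

(i) $\limsup \bar P^n A \to 0$ as $A \downarrow \emptyset$.

(ii) $\lim \bar P^n = \bar P$ exists and is a pr. on $\mathcal A$.

(iii) There exists on $\mathcal A$ a pr. $\bar P$ invariant under $T$ and coinciding with
$P$ on the $\sigma$-field $\mathcal C$ of invariant events (and then $\lim \bar P^n = \bar P$).

We shall denote by $\bar E^{\mathcal C} X$ the c.exp. of $X$ given $\mathcal C$ with respect to $\bar P$,
defined by

$$\int_C (\bar E^{\mathcal C} X) \, d\bar P = \int_C X \, d\bar P, \quad C \in \mathcal C,$$

which exists when $\int X \, d\bar P$ exists.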

Proof. 1° Since a finite measure is continuous at $\emptyset$, (ii) implies (i).
Conversely, if (i) holds, then, by the particular case of the ergodic
lemma and the dominated convergence theorem,

$$\bar P^n A = \int \bar T^n I_A \, dP \to \int \bar T I_A \, dP = \bar P A, \quad A \in \mathcal A.$$

Clearly, $\bar P$ so defined on $\mathcal A$ is nonnegative and finitely additive, with
$\bar P\Omega = 1$. Moreover, (i) becomes $\bar P A \to 0$ as $A \downarrow \emptyset$, so that $\bar P$ is also
continuous at $\emptyset$. Thus, $\bar P$ is a pr. on $\mathcal A$, and (i) implies (ii).

2° If (ii) holds, then, as $n \to \infty$,

$$|\bar P TA - \bar P A| \leftarrow |\bar P^n TA - \bar P^n A| = \frac{1}{n} |PT^n A - PA| \le \frac{1}{n} \to 0$$

so that $\bar P$ is invariant under $T$. Since for an invariant event $C$, $\bar P C =
\bar P^n C = PC$, it follows that $\bar P = P$ on $\mathcal C$. Thus (ii) implies (iii).

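Conversely, if there exists on $\mathcal A$ an invariant pr. $\bar P$ which coincides
with $P$ on $\mathcal C$, then, by the stationarity theorem, $\bar T^n I_A \to \bar E^{\mathcal C} I_A$ outside
an invariant $\bar P$-null and hence $P$-null, event and, by the dominated
convergence theorem and $P_{\mathcal C} = \bar P_{\mathcal C}$,

$$\bar P^n A = \int \bar T^n I_A \, dP \to \int \bar E^{\mathcal C} I_A \, dP_{\mathcal C} = \int \bar E^{\mathcal C} I_A \, d\bar P_{\mathcal C} = \bar P A.$$

Thus (iii) implies (ii), and the proof is terminated.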

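A. A.s. ergodic theorem. Let

$$\limsup \bar P^n A \to 0 \ \text{as} \ A \downarrow \emptyset.$$

Then $\bar P^n \to \bar P$ invariant pr. on $\mathcal A$ with $\bar P = P$ on $\mathcal C$ and

(i) For every nonnegative r.v. $X$

$$\bar T^n X \xrightarrow{\text{a.s.}} \bar E^{\mathcal C} X$$

while for every r.v. $X$

$$\bar T^n X \xrightarrow{\text{a.s.}} \bar E^{\mathcal C} X^+ - \bar E^{\mathcal C} X^-$$

on the invariant set on which the right-hand side generalized c.exp. exists,
outside an invariant null subset.

(ii) If $\int X \, d\bar P$ exists or if the sequences $\bar T^n X^{\pm}$ converge in pr. (a
fortiori, if they converge in the rth mean), then $\bar E^{\mathcal C} X$ exists and in the second
case is finite, and

$$\bar T^n X \xrightarrow{\text{a.s.}} \bar E^{\mathcal C} X.$$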

Proof. The first assertion follows by the invariant pr. lemma. Since
all sequences $T^n X$ are $\bar P$-stationary, assertion (i) follows by the
stationarity theorem and the fact that an invariant $\bar P$-null set is $P$-null. The
first case of assertion (ii) is immediate and the second case follows from
(i) by the fact that the limits of sequences of r.v.'s which converge in
pr. are r.v.'s; hence $\bar E^{\mathcal C} X^+ - \bar E^{\mathcal C} X^-$ a.s. exists, outside an
invariant null set.

So far, whenever $\bar P = \lim \bar P^n$ appeared, we either proved or assumed
that $\bar P$ is a pr. Since, by Complements and Details, 19, of Chapter I,
the limit of the sequence of pr.'s is a pr., the fact that $\bar P = \lim \bar P^n$
implies that $\bar P$ is a pr. Thus, in this subsection, we can drop the
assumption that $\lim \bar P^n$ is a pr. Then, the a.s. ergodic theorem yields at once
the following

B. A.s. ergodic criterion. The sequences $\bar T^n X$ converge a.s. for
every nonnegative r.v. $X$ if, and only if, the sequences $\bar P^n A$ converge for
every event $A$.

The ergodic hypothesis corresponds to $T$ being $P$-indecomposable,
that is, the $\sigma$-field $\mathcal C$ of invariant sets reducing a.s. to $\emptyset$ and $\Omega$ or,
equivalently, the invariant functions degenerating into constants (finite or
not).

C. Indecomposability theorem. The following properties are equivalent.

(i) $\bar T^n X \xrightarrow{\text{a.s.}} \bar T X$ degenerate for every $X \ge 0$.

(ii) $T$ is $P$-indecomposable and $\bar P^n \to \bar P$.

(iii) $\frac{1}{n}\sum_{k=1}^{n} I_{T^{k-1}A} \xrightarrow{\text{a.s.}} \bar P A$ for every $A \in \mathcal A$.

(iv) $\frac{1}{n}\sum_{k=1}^{n} P(T^{k-1}A)B \to \bar P A \cdot PB$ for every pair $A, B \in \mathcal A$.

Proof. (i) $\Leftrightarrow$ (ii) by the a.s. ergodic criterion.

(ii) $\Rightarrow$ (iii) by the a.s. ergodic theorem and the fact that $\bar P^{\mathcal C} A$
degenerates into $\bar P A$.

(iii) $\Rightarrow$ (iv) by integrating over $B$ with respect to $P$ and using the
dominated convergence theorem.

(iv) $\Rightarrow$ (ii) by setting $B = \Omega$ so that $\bar P^n \to \bar P$ and hence $\bar P = P$ on
$\mathcal C$; then, setting $A = B = C \in \mathcal C$ so that $PC = \bar P C \cdot PC = (PC)^2$, we
have $PC = 0$ or $1$.

The proof is complete.
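
Property (iv) can be verified by enumeration in a small example. The sketch below is an illustration under assumptions: the space is $\{0, \cdots, N-1\}$ with the uniform pr. (which is invariant, so that $\bar P = P$), the translation is induced by a cyclic shift, and the name `cesaro_mixing` is hypothetical. Averaging over exactly $N$ terms covers whole periods of the shift, so the average already equals its limit.

```python
from fractions import Fraction as F


def cesaro_mixing(A, B, shift, N):
    """Return (1/N) * sum over k = 0..N-1 of P(T^k A and B), where P is uniform
    on {0, ..., N-1} and T A = {w : (w + shift) mod N in A}."""
    total = F(0)
    current = frozenset(A)          # T^0 A = A
    B = frozenset(B)
    for _ in range(N):
        total += F(len(current & B), N)
        # Apply T once more: inverse image under w -> (w + shift) mod N.
        current = frozenset(w for w in range(N) if (w + shift) % N in current)
    return total / N


if __name__ == "__main__":
    evens = {0, 2, 4}
    # Shift by 1 on 6 points: no nontrivial invariant set, the limit is PA * PB.
    print(cesaro_mixing(evens, {0, 1}, 1, 6))   # PA * PB = 1/2 * 1/3
    # Shift by 2: the even points form an invariant set, the product rule fails.
    print(cesaro_mixing(evens, evens, 2, 6))    # stays at P(evens) = 1/2
```

With the shift by 2 the set of even points is invariant with pr. 1/2, so the translation is decomposable and (iv) fails for $A = B$ equal to that set, in agreement with the last step of the proof.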

34.3. Ergodic theorems on spaces $L_r$. We can now attack the problem
of convergence a.s. or in pr. or in the rth mean of sequences $\bar T^n X$
to a limit $\bar T X \in L_r$ for every $X \in L_r$, $r \ge 1$. Since a point $X \in L_r$ is
an arbitrary element of a class of equivalence and not a specific r.v.,
the transformation $X \to \bar T X$ must not split classes of equivalence,
that is, is to be a mapping on $L_r$ to $L_r$. This is accomplished if we
assume, and we shall do so in the sequel, that the translation $T$ is
null-preserving, that is, if $PA = 0$, then $PTA = 0$. For, then $X = X'$ a.s.
implies that $T^n X = T^n X'$ a.s.; hence $\bar T X = \bar T X'$ a.s.

We recall that $\|X\|_r = (\int |X|^r)^{1/r}$ is the norm of $X \in L_r$ $(r \ge 1)$,
and a mapping $M$ on $L_r$ to $L_r$ is

linear if $M(aX + a'X') = aMX + a'MX'$ a.s., $a, a' \in R$,
nonnegative if $X \ge 0$ a.s. $\Rightarrow$ $MX \ge 0$ a.s.,

bounded (or normed) if $\|MX\|_r \le c\|X\|_r$, where $c$ is a finite constant
independent of $X \in L_r$; the smallest of these constants is the
norm $\|M\|_r$ of $M$, defined by $\|M\|_r = \sup \|MX\|_r / \|X\|_r$
for all $X \ne 0$ a.s.

We drop the subscript $r$ when $r = 1$. Also, denoting by "c" some
type of convergence, we write $M_n \xrightarrow{c} M$ on $L'$ when $M_n X \xrightarrow{c} MX$ for
every $X \in L' \subset L_r$, and drop "on $L'$" when $L' = L_r$.

We require three properties of linear mappings, of which the second
and the third extend at once to arbitrary Banach spaces (Banach-Steinhaus)
and the first extends to partially ordered Banach spaces
under a supplementary assumption on the norms.

a. Linear mappings lemma. Let $M$, $M_n$ be linear mappings on $L_r$ to
$L_r$, $r \ge 1$.

(i) If $M$ is nonnegative, then it is bounded.

(ii) If the $M_n$ are bounded and $\limsup \|M_n X\|_r < \infty$ for every $X \in
L_r$, then they are uniformly bounded.

(iii) If the $M_n$ are uniformly bounded and $M_n \xrightarrow{r} M$ on the subspace $L_\infty$
of all bounded r.v.'s, then $M_n \xrightarrow{r} M$ on $L_r$.

Proof. 1° Because of the linearity of $M$, to prove (i) it suffices
to show that a nonnegative $M$ is bounded on the subspace of all a.s.
nonnegative $X \in L_r$. If $M$ is not bounded, then there exists a sequence
$X_n \ge 0$ a.s. such that $\|X_n\|_r = 1$ while $\|MX_n\|_r > n^2$. Thus
$\sum \|X_n\|_r / n^2 < \infty$; hence $X = \sum X_n / n^2 \in L_r$,
while, by the elementary inequality $(a + b)^r \ge a^r + b^r$, $a, b \ge 0$,
$r \ge 1$, as $n \to \infty$

$$\int (MX)^r \ge \int \Big\{ M\Big( \sum_{k=1}^{n} X_k / k^2 \Big) \Big\}^r \ge \sum_{k=1}^{n} \int (MX_k / k^2)^r > n \to \infty.$$

Thus $\|MX\|_r = \infty$, the assertion follows ab contrario, and (i) is proved.

2° Let the linear mappings $M_n$ be bounded and let $\limsup \|M_n X\|_r
< \infty$ for every $X \in L_r$. To prove that all $\|M_n\|_r \le c < \infty$, it suffices
to show that all $\|M_n X'\|_r \le c' < \infty$ for all points $X'$ belonging to
some sphere $S = [X' : \|X' - X_0\|_r < \delta]$. For then, if $\|X\|_r < \delta$, we
have $\|M_n X\|_r = \|M_n(X + X_0) - M_n X_0\|_r \le 2c'$; hence for any
$X \in L_r$, $X \ne 0$,

$$\|M_n X\|_r = \frac{2\|X\|_r}{\delta} \Big\| M_n\Big( \frac{\delta X}{2\|X\|_r} \Big) \Big\|_r \le \frac{4c'}{\delta} \|X\|_r.$$

But $L_r = \bigcup_{m=1}^{\infty} L_r^m$, where $L_r^m$ is the closed set of those points $X$ for
which all $\|M_n X\|_r \le m$. Since the space $L_r$ is complete, it follows,
by Baire's category theorem, that at least one of the $L_r^m$ is of second
category and, consequently, contains a sphere $S$ we were looking for.
Assertion (ii) is proved.

3° Let all $\|M_n\|_r \le c < \infty$ and let $M_n \xrightarrow{r} M$ on $L_\infty$. For every
$X \in L_r$, $X = X' + X''$, where $X' = XI_{[|X| < k]}$ and $X'' = XI_{[|X| \ge k]}$.
Then, by linearity of the $M_n$ and completeness of $L_r$, as $m, n \to \infty$,
then $k \to \infty$,

$$\|M_m X - M_n X\|_r \le \|M_m X' - M_n X'\|_r + \|M_m X''\|_r + \|M_n X''\|_r \le \|M_m X' - M_n X'\|_r + 2c\|X''\|_r \to 0.$$

Assertion (iii) follows.

In what follows, c denotes some finite positive constant independent 
of n. 

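A. $L_r$-ergodic theorem. The following implications hold on $L_r$ to $L_r$
with $r \ge 1$:

$$\bar T^n \xrightarrow{r} \bar T \Rightarrow \bar T^n \xrightarrow{P} \bar T \Leftrightarrow \bar T^n \xrightarrow{\text{a.s.}} \bar T \Leftrightarrow \limsup \|\bar T^n\|_r < \infty \ \text{on} \ L_\infty,$$

$$\bar T^n \xrightarrow{r} \bar T \Leftrightarrow \|\bar T^n\|_r \le c \Rightarrow \bar P^n \le cP^{1/r} \Rightarrow \bar P \le cP^{1/r} \Rightarrow \bar T^n \xrightarrow{r} \bar T \ \text{on} \ L_\infty.$$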

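Proof. 1° The implications

$$\bar T^n \xrightarrow{r} \bar T \Rightarrow \bar T^n \xrightarrow{P} \bar T \Leftarrow \bar T^n \xrightarrow{\text{a.s.}} \bar T$$

require no proof.

$\bar T^n \xrightarrow{P} \bar T \Rightarrow \bar P^n \to \bar P \le cP^{1/r} \Rightarrow \bar T^n \xrightarrow{r} \bar T$ on $L_\infty$. For, $\bar T$ being
obviously a linear and nonnegative mapping on $L_r$ to $L_r$, the linear
mappings lemma (i) applies, so that $\|\bar T\|_r \le c$ and, by the dominated
convergence theorem,

$$\bar P^n A = \int \bar T^n I_A \, dP \to \int \bar T I_A \, dP \le \Big( \int (\bar T I_A)^r \, dP \Big)^{1/r} \le c \Big( \int (I_A)^r \, dP \Big)^{1/r} = cP^{1/r}A.$$

Therefore, $\limsup \bar P^n A \to 0$ as $A \downarrow \emptyset$ and, by the ergodic lemma and
the same theorem, $\bar T^n \xrightarrow{r} \bar T$ on $L_\infty$.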


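$\bar T^n \xrightarrow{P} \bar T \Rightarrow \bar T^n \xrightarrow{\text{a.s.}} \bar T$ with $\bar T = \bar E^{\mathcal C}$. For, then $\bar P^n \to \bar P \le cP^{1/r}$
while $\bar T^n X \xrightarrow{P} \bar T X$, for every $X \in L_r$ and hence, by the a.s.
ergodic theorem, $\bar T^n \xrightarrow{\text{a.s.}} \bar T$ with $\bar T = \bar E^{\mathcal C}$.

2° $\bar T^n \xrightarrow{P} \bar T \Rightarrow \lim \|\bar T^n\|_r \le c$ on $L_\infty$. For then $\bar T^n \xrightarrow{r} \bar T$ on $L_\infty$
and hence $\|\bar T^n X\|_r \to \|\bar T X\|_r$ on $L_\infty$ and we can take $c = \|\bar T\|_r$.

$\limsup \|\bar T^n\|_r \le c$ on $L_\infty$ $\Rightarrow \bar T^n \xrightarrow{\text{a.s.}} \bar T$. For then, on the one
hand, $\bar P = \limsup \bar P^n \le cP^{1/r}$ and hence, by 1°, $\bar T^n \xrightarrow{r} \bar T$ on $L_\infty$, lim sup
become lim, and $\|\bar T^n X\|_r \to \|\bar T X\|_r$ on $L_\infty$; on the other hand, by the
a.s. ergodic theorem, $\bar T^n X \xrightarrow{\text{a.s.}} \bar E^{\mathcal C} X$ for every nonnegative r.v. $X$. Thus,
to prove the assertion, it suffices to show that, for $0 \le X \in L_r$, we
have $\bar E^{\mathcal C} X \in L_r$. But, setting $X_m = XI_{[|X| < m]}$ so that $0 \le X_m \uparrow X$ as
$m \to \infty$ and $X_m \in L_\infty$, we have, by what precedes, $\int (\bar E^{\mathcal C} X_m)^r \, dP \le
c^r \int (X_m)^r \, dP$. Letting $m \to \infty$, it follows, by the monotone and
conditional monotone convergence theorems, that

$$\int (\bar E^{\mathcal C} X)^r \, dP \le c^r \int (X)^r \, dP < \infty.$$

Thus, $\bar E^{\mathcal C} X \in L_r$, and the proof is complete.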

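3° $\bar T^n \xrightarrow{r} \bar T \Rightarrow \|\bar T^n\|_r \le c$. For, then, clearly for every $X \in L_r$,
$TX, T^2X, \cdots \in L_r$ and hence $\bar T^n X \in L_r$. Thus every $\bar T^n$ is a mapping
on $L_r$ to $L_r$ and, these mappings being obviously linear and nonnegative,
the linear mappings lemma (i) applies. Therefore, every $\bar T^n$ is bounded,
while $\limsup \|\bar T^n X\|_r = \|\bar T X\|_r < \infty$ and hence, by the linear
mappings lemma (ii), all $\|\bar T^n\|_r \le c$.

$\|\bar T^n\|_r \le c \Rightarrow \bar P^n \le cP^{1/r} \Rightarrow \bar P^n \to \bar P \le cP^{1/r}$. For, then

$$\bar P^n A = \int \bar T^n I_A \, dP \le \Big( \int (\bar T^n I_A)^r \, dP \Big)^{1/r} \le c \Big( \int (I_A)^r \, dP \Big)^{1/r} = cP^{1/r}A,$$

and hence

$$\limsup \bar P^n A \le cP^{1/r}A \to 0 \ \text{as} \ A \downarrow \emptyset.$$

Thus, by the invariant pr. lemma, $\bar P^n \to \bar P \le cP^{1/r}$.

$\|\bar T^n\|_r \le c \Rightarrow \bar T^n \xrightarrow{r} \bar T$. For, then $\bar T^n \xrightarrow{\text{a.s.}} \bar T$ on $L_\infty$ and, hence,
by the dominated convergence theorem, $\bar T^n \xrightarrow{r} \bar T$ on $L_\infty$; the linear
mappings lemma (iii) applies, and $\bar T^n \xrightarrow{r} \bar T$. The theorem is proved.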
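B. $L_r$-ergodic criteria. The following equivalences hold on $L_r$, $r \ge 1$.

A.s. ergodic criterion:

$$\bar T^n \xrightarrow{P} \bar T \Leftrightarrow \bar T^n \xrightarrow{\text{a.s.}} \bar T \Leftrightarrow \limsup \|\bar T^n\|_r \le c \ \text{on} \ L_\infty.$$

Mean ergodic criterion:

$$\bar T^n \xrightarrow{r} \bar T \Leftrightarrow \sup \|\bar T^n\|_r \le c.$$

And for $r = 1$:

$$\bar T^n \xrightarrow{\text{a.s.}} \bar T \Leftrightarrow \bar P^n \to \bar P \le cP, \qquad \bar T^n \xrightarrow{1} \bar T \Leftrightarrow \sup \bar P^n \le cP.$$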


Proof. According to the $L_r$-ergodic theorem, it suffices to prove that

$$\bar P^n \to \bar P \le cP \Rightarrow \bar T^n \xrightarrow{\text{a.s.}} \bar T \quad \text{and} \quad \bar P^n \le cP \Rightarrow \|\bar T^n\| \le c.$$

The first implication follows by the a.s. ergodic theorem from the
inequality

$$\int |X| \, d\bar P \le c \int |X| \, dP < \infty$$

which holds by hypothesis for all indicators $X = I_A$; hence, as usual,
it holds for simple r.v. and then for any r.v. $X \in L$. Similarly, the
second assertion follows by the $L_r$-ergodic theorem from the inequality

$$\int |\bar T^n X| \, dP \le c \int |X| \, dP$$

which holds by hypothesis for all indicators and hence for all r.v.'s
$X \in L$.







Remark. In this subsection, the translation $T$ was assumed to be
null-preserving, that is, $PA = 0$ implies that $PTA = PT^2A = \cdots = 0$.
Thus, every $\bar P^n$ is $P$-continuous and, if $\bar P^n \to \bar P$ pr. on $\mathcal A$, then $\bar P$ is
$P$-continuous. In other words, we can select nonnegative integrable r.v.'s
$p_n$ and $p$ (with $Ep_n = Ep = 1$) such that

$$\bar P^n A = \int_A p_n \, dP, \quad \bar P A = \int_A p \, dP, \quad A \in \mathcal A.$$

The reader is invited to play around with these r.v.'s and the results
of this subsection. For example, the following relations hold:

$$E^{\mathcal C} p_n = E^{\mathcal C} p = 1 \ \text{a.s.}, \quad \bar E^{\mathcal C} X = E^{\mathcal C}(pX) \ \text{a.s.};$$

P n ― > P pr. <= p n — ^ p ^ p n ― > p* 

$$\bar P^n \le cP \Leftrightarrow p_n \le c \ \text{a.s.}, \quad \bar P \le cP \Leftrightarrow p \le c \ \text{a.s.}$$

*§35. ERGODIC THEOREMS ON BANACH SPACES

The implication $\|\bar T^n\|_r \le c \Rightarrow \|\bar T^n X - \bar T X\|_r \to 0$ for every
$X \in L_r$ remains meaningful for transformations $T$ on a Banach space
$B$ of points $X$ to itself, provided the norms are interpreted as norms in
$B$. The question arises whether a similar ergodic implication can be
obtained for Banach spaces and this without reference to an underlying
pr. space. It is to be expected that some supplementary condition will
be required; for sequences of bounded or uniformly bounded
transformations have compactness properties in $L_r$ that they do not have in
general Banach spaces. We shall follow Yosida and Kakutani.

35.1. Norms ergodic theorem. Let $T$ be a linear transformation on
a Banach space $B$ to itself, and set $\bar T^n = \frac{1}{n}\sum_{k=1}^{n} T^{k-1}$. Let $X$, with or
without affixes, denote a point in $B$ and let $c$ be some finite positive
constant independent of $n$.

We say that a sequence $X_n$ converges strongly to $X$ and write $X_n
\xrightarrow{s} X$, if the convergence is in norm, that is, if $\|X_n - X\| \to 0$. If
there exists a transformation $\bar T$ on $B$ to $B$ such that $\bar T^n X \xrightarrow{s} \bar T X$ for
every $X \in B$, then we write $\bar T^n \xrightarrow{s} \bar T$. A sequence $X_n$ converges
weakly to $X$, and we write $X_n \xrightarrow{w} X$, if $f(X_n) \to f(X)$ for all bounded
linear functionals $f$ on $B$. The term "weakly" as opposed to "strongly"
is justified by the fact that $X_n \xrightarrow{s} X$ implies $X_n \xrightarrow{w} X$, for

$$|f(X_n) - f(X)| = |f(X_n - X)| \le \|f\| \cdot \|X_n - X\|.$$

The null-transformation, to be denoted by 0, maps every $X \in B$ onto
the null element $\theta \in B$.

A subset $B'$ is range of a transformation $T'$ on $B$ to $B$ if $T'$ maps $B$
onto $B'$, and we write $B' = T'B$.

$B'$ is linear if it is closed under all (finite) linear combinations of its
elements; observe that if $T'$ is linear, then $T'B$ is a linear subspace.

$B'$ is weakly (strongly) closed if it is closed under weak (strong)
passages to the limit; observe that a strongly closed linear subspace is a
Banach space. We denote the strong closure of $B'$ by $\bar B'$. In fact, the
strong closure of $B'$ is also its weak closure. For, if $X_n \in B'$ and $X_n
\xrightarrow{w} X$, then $X \notin \bar B'$ implies that the distance $d$ of $X$ to $\bar B'$ is positive.
But, by Corollary 2 of the Hahn-Banach theorem, there exists a
functional $f$ such that $0 = f(X_n) \to f(X) = d > 0$, and we reach a
contradiction.

$B'$ is weakly (strongly) compact if every sequence in $B'$ contains a
weakly (strongly) convergent subsequence; observe that strong
compactness implies weak compactness.

a. Convergence lemma. Let all $\|T^n\| \le c$. Then

$$(T - I)\bar T^n = \bar T^n(T - I) \xrightarrow{s} 0$$

and

$$\bar T^n \xrightarrow{s} 0 \ \text{on} \ \overline{(T - I)B}.$$

Proof. Since for every $X \in B$

$$\|(T - I)\bar T^n X\| = \|\bar T^n(T - I)X\| = \frac{1}{n} \|T^n X - X\| \to 0,$$

the first assertion is true. Since, given $\epsilon > 0$, for every $X \in \overline{(T - I)B}$
there exists an $X' \in (T - I)B$ such that $\|X - X'\| < \epsilon$ and there
exists an $X'' \in B$ such that $X' = (T - I)X''$, it follows that, as $n \to \infty$
and then $\epsilon \to 0$,

$$\|\bar T^n X\| \le \|\bar T^n X'\| + \|\bar T^n(X - X')\| \le \|\bar T^n(T - I)X''\| + c\epsilon \to 0,$$

and the second assertion is proved.

b. Norms ergodic lemma. Let all $\|T^n\| \le c$. Then every weakly compact sequence $\bar T_n X$ converges strongly to a point invariant under $T$.

Proof. If a subsequence $\bar T_{n'}X \xrightarrow{w} \bar X$ as $n' \to \infty$, then $(T - I)\bar T_{n'}X \xrightarrow{w} (T - I)\bar X$; hence, by the first part of the convergence lemma, $(T - I)\bar X = \theta$, that is, $T\bar X = \bar X$. Therefore, $\bar T_n\bar X = \bar X$ whatever be $n$ and, setting $X = \bar X + (X - \bar X)$, it remains to be proved that $\bar T_n(X - \bar X) \xrightarrow{s} 0$. Because of the second part of the convergence lemma, it suffices to prove that $X - \bar X \in \overline{(T - I)B}$. But

$$(\bar T_n - I)X = (T - I)\left(\frac{n-1}{n}I + \frac{n-2}{n}T + \cdots + \frac{1}{n}T^{n-2}\right)X$$

so that all $(\bar T_n - I)X \in (T - I)B$, while a subsequence $(\bar T_{n'} - I)X \xrightarrow{w} \bar X - X$ as $n' \to \infty$. It follows that $\bar X - X \in \overline{(T - I)B}$, and the proof is concluded.

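A. Norms ergodic theorem. Let all $\|T^n\| \le c$. If all sequences $\bar T_n X$ are weakly compact, then $\bar T_n \xrightarrow{s} \bar T$ linear and

$$\|\bar T\| \le c, \qquad T\bar T = \bar T T = \bar T\bar T = \bar T.$$

Proof. According to the norms ergodic lemma, every sequence $\bar T_n X$ converges strongly so that the passage to the limit is a transformation $\bar T$ on $B$ to $B$, obviously linear and of norm bounded by $c$. Since, on account of the convergence lemma,

$$0 \xleftarrow{s} \bar T_n(T - I) \xrightarrow{s} \bar T(T - I) = (T - I)\bar T,$$

it follows that

$$\bar T T = T\bar T = \bar T, \quad \text{hence} \quad \bar T_n\bar T = \bar T \quad \text{and} \quad \bar T\bar T = \bar T,$$

and the proof is concluded.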
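
The theorem can be checked by hand in a finite-dimensional case. The sketch below is only an illustration under assumptions not in the text: $B$ is taken to be $R^3$, $T$ is a cyclic permutation matrix (so every power has norm 1), and the helper names are hypothetical. It computes the averages $\frac{1}{n}(I + T + \cdots + T^{n-1})$ and compares them with the projection onto the constant vectors, which are the invariant points here.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def cesaro_average(t, n):
    """Return (1/n) * (I + T + ... + T^(n-1))."""
    size = len(t)
    total = [[0.0] * size for _ in range(size)]
    power = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        for i in range(size):
            for j in range(size):
                total[i][j] += power[i][j]
        power = mat_mul(power, t)
    return [[entry / n for entry in row] for row in total]


def max_abs_diff(a, b):
    """Largest entrywise distance between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Assumed example: T shifts the coordinates of a vector in R^3 cyclically.
T = [[0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0],
     [1.0, 0.0, 0.0]]

# Candidate limit: the projection onto the constant (invariant) vectors.
T_BAR = [[1.0 / 3.0] * 3 for _ in range(3)]

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, max_abs_diff(cesaro_average(T, n), T_BAR))
```

In this example the distance to the limit is of order $1/n$, and the limit satisfies $T\bar T = \bar T\bar T = \bar T$, in agreement with the relations of the theorem.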

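The set $B_\lambda$ of all points $X$ such that $TX = \lambda X$ is said to be the proper subspace of $T$ corresponding to the proper value $\lambda$ of $T$ ($\lambda$ real or complex), provided this set does not consist of the null-point $\theta$ only. Thus, $T - \lambda I = 0$ on $B_\lambda \ne \{\theta\}$, and $\lambda$ is not a proper value of $T$ if, and only if, $TX = \lambda X$ implies that $X = \theta$. Since, in this section, $T$ is bounded and linear, every $B_\lambda$ is, clearly, a strongly closed linear subspace of $B$.

Corollary. In the norms ergodic theorem, $\bar T \ne 0$ if and only if $\lambda = 1$ is a proper value of $T$, and then $\bar T B = B_1$.

This follows by the implications

$$TX = X \Rightarrow \bar T_n X = X \Rightarrow \bar T X = X$$

and

$$\bar T X = X \Rightarrow TX = T\bar T X = \bar T X = X.$$

B. Extended norms ergodic theorem. Let all $\|T^n\| \le c$ and let all sequences $\bar T_{\lambda,n}X$ be weakly compact with $T_\lambda = T/\lambda$, where $\lambda$ with or without affixes have modulus 1. Then

(i) $\bar T_{\lambda,n} \xrightarrow{s} \bar T_\lambda$ linear,

$$\|\bar T_\lambda\| \le c, \qquad T_\lambda\bar T_\lambda = \bar T_\lambda T_\lambda = \bar T_\lambda\bar T_\lambda = \bar T_\lambda, \qquad \bar T_\lambda\bar T_{\lambda'} = 0 \ \text{ for } \lambda \ne \lambda',$$

$\bar T_\lambda \ne 0 \iff \lambda$ is a proper value of $T$, and then $B_\lambda = \bar T_\lambda B$.

(ii) If $T' = T - \sum_{j=1}^{m} \lambda_j\bar T_{\lambda_j}$ with all $\bar T_{\lambda_j} \ne 0$ and all $\lambda_j$ distinct, then

$$T^n = T'^n + \sum_{j=1}^{m} \lambda_j^n\bar T_{\lambda_j}, \qquad T'\bar T_{\lambda_j} = \bar T_{\lambda_j}T' = 0, \qquad TT' = T'T = T'T',$$

$\lambda$ is a proper value of $T'$ $\iff$ $\lambda$ is a proper value of $T$ and all $\lambda_j \ne \lambda$.

Proof. The norms ergodic theorem and its corollary, with $T$ replaced by $T_\lambda = T/\lambda$, yield directly properties (i) except for $\bar T_\lambda\bar T_{\lambda'} = 0$, $\lambda \ne \lambda'$. The latter follows from

$$\bar T_{\lambda,n}\bar T_{\lambda'} = \left(\frac{1}{n}\sum_{k=1}^{n}(\lambda'/\lambda)^{k-1}\right)\bar T_{\lambda'} \to 0.$$

Properties (ii) follow from (i) by elementary computations; only the last one deserves a proof. Let all $\bar T_{\lambda_j} \ne 0$, the $\lambda_j$ be distinct, and $X \ne \theta$. If $TX = \lambda X$ and all $\lambda_j \ne \lambda$, then, by (i), $\bar T_{\lambda_j}X = \bar T_{\lambda_j}\bar T_\lambda X = 0$, and forming $T'X$ it follows that $T'X = TX = \lambda X$. Conversely, if $T'X = \lambda X$, then

$$TX = TT'X/\lambda = T'TX/\lambda = T'T'X/\lambda = \lambda X;$$

moreover, if a $\lambda_j = \lambda$, then by (i) and the second relation in (ii),

$$\theta \ne X = \bar T_\lambda X = \bar T_{\lambda_j}X = \bar T_{\lambda_j}T'X/\lambda = 0,$$

and the converse follows ab contrario.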
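
The step $\bar T_\lambda\bar T_{\lambda'} = 0$ rests on the fact that the arithmetic means of the powers of a number $z \ne 1$ of modulus 1 tend to 0. The following sketch, with a hypothetical helper name and an arbitrarily chosen ratio $z = \lambda'/\lambda$, checks the elementary bound $\left|\frac{1}{n}\sum_{k=1}^{n} z^{k-1}\right| \le \frac{2}{n|1 - z|}$ that comes from summing the geometric progression.

```python
import cmath


def cesaro_mean_of_powers(z, n):
    """Return (1/n) * sum_{k=1}^{n} z**(k-1)."""
    total = 0j
    power = 1 + 0j  # z**0
    for _ in range(n):
        total += power
        power *= z
    return total / n


# Assumed example: the ratio of two distinct proper values of modulus 1.
RATIO = cmath.exp(2j * cmath.pi / 5)

if __name__ == "__main__":
    for n in (10, 100, 1000):
        mean = cesaro_mean_of_powers(RATIO, n)
        bound = 2.0 / (n * abs(1 - RATIO))
        print(n, abs(mean), bound)
```

When $z = 1$ the mean equals 1 for every $n$, which corresponds to the case $\lambda' = \lambda$.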


Remark. The condition (i) $\sup \|T^n\| < \infty$ implies that (i') $\sup \|\bar T_n\| < \infty$ and (i'') $\|T^n\|/n \to 0$. In fact, the foregoing proofs and hence the propositions remain valid if (i) is replaced by (i') and (i''). Furthermore, if all sequences $\bar T_n X$ are weakly compact, then it is easily proved that $\sup \|\bar T_n\| < \infty$ (use the Banach-Steinhaus parts of the linear mappings lemma extended to Banach spaces). Thus, in the norms ergodic theorem, $\sup \|T^n\| < \infty$ can be replaced by $\|T^n\|/n \to 0$. This weakening is due to Dunford.

As for the converse, let us only mention that, if $\bar T_n \xrightarrow{s} \bar T$, then, by the same lemma, $\sup \|\bar T_n\| < \infty$. Also, if the Banach space is reflexive, that is, can be identified with its second adjoint (for example, if it is a space $L_r$ with $r > 1$), then the condition $\sup \|\bar T_n\| < \infty$ implies that every sequence $\bar T_n X$ is weakly compact.

35.2. Uniform norms ergodic theorems. We recall that in this section all mappings are bounded linear ones on $B$ to $B$.

The weak compactness assumption in the norms ergodic theorem is fulfilled, under the condition that all $\|T^n\| \le c$, if $T$ is weakly (strongly) compact, that is, maps the sphere of points $X$ with $\|X\| \le 1$ onto a weakly (strongly) compact set. In fact, as we shall see, it suffices that $T$ be quasi-weakly (strongly) compact, that is, that there be an integer $h$ such that $T^h = V + W$, where $V$ is weakly (strongly) compact and $\|W\| \le w < 1$. And it is to be expected that in the quasi-strongly compact case the norms ergodic theorem can be made more precise; it will become the uniform norms ergodic theorem toward which this subsection is directed. This theorem was first given by Krylov and Bogoliubov for Markov chains.

Observe that, if $M_1$ and $M_2$ are weakly (strongly) compact and $M$ is bounded, then $M_1 + M_2$ as well as $MM_1$ and $M_1M$ are weakly (strongly) compact. Therefore, when $T$ is quasi-weakly (strongly) compact, then for every integer $k$

$$\text{(Q)} \qquad T^{hk} = V_k + W^k, \qquad \|W^k\| \le w^k, \qquad w < 1,$$

where $V_k = (V + W)^k - W^k$, being a finite sum of terms with at least one weakly (strongly) compact factor $V$, is itself weakly (strongly) compact.

A. Let all $\|T^n\| \le c$ and let $T$ be quasi-weakly compact. Then the conclusions of the extended norms ergodic theorem are valid.

Proof. It suffices to prove that all sequences $\bar T_n X$ are weakly compact. We use repeatedly the relation (Q), denote by $f$ an arbitrary bounded linear functional on $B$, and set $Y_{m,p} = p\bar T_p X/m$. Thus

$$\bar T_m X = \frac{p}{m}\bar T_p X + \frac{m - p}{m}T^p\bar T_{m-p}X, \qquad m > p,$$

will become

$$\bar T_m X = Y_{m,hk} + V_kY_{m,m-hk} + W^kY_{m,m-hk}, \qquad m > hk.$$

Since $V_k$ is weakly compact, there exists a subsequence $Y'_n = V_kY_{m_n,\,m_n-hk}$ such that $f(Y'_n) \to f(\bar Y_k)$, where $\bar Y_k$ is some point which may depend upon $k$. Since

$$|f(Y_{m_n,hk})| \le \frac{hk}{m_n}\,c\,\|f\| \cdot \|X\| \to 0$$

and

$$|f(W^kY_{m_n,\,m_n-hk})| \le cw^k\|f\| \cdot \|X\|,$$

it follows that, with $Y_n = \bar T_{m_n}X$,

$$\limsup |f(Y_n) - f(\bar Y_k)| \le cw^k\|f\| \cdot \|X\|.$$

By selecting successive subsequences for $k = 1, 2, \ldots$ and applying the diagonal procedure, we may assume that this inequality holds for all $k$ whatever be $f$. Since, on account of a corollary of the Hahn-Banach theorem, there exists an $f$ such that $|f(\bar Y_k) - f(\bar Y_l)| = \|f\| \cdot \|\bar Y_k - \bar Y_l\|$, it follows that, as $k, l \to \infty$,

$$\|\bar Y_k - \bar Y_l\| \le c(w^k + w^l)\|X\| \to 0$$

so that $\bar Y_k \xrightarrow{s} Y \in B$ and, whatever be $f$,

$$\limsup |f(Y_n) - f(Y)| \le cw^k\|f\| \cdot \|X\| + |f(\bar Y_k - Y)| \to 0.$$

Thus $f(Y_n) \to f(Y)$, and the assertion is proved.

To pass to the quasi-strongly compact case, we require the following properties.

a. If $B'$ is a strict Banach subspace of $B$, then for every $\epsilon \in (0, 1)$ there exists an $X \in B$ such that

$$\|X\| = 1, \qquad \|X - X'\| \ge 1 - \epsilon \ \text{ for all } X' \in B'.$$

Proof. There exists a point $Y \in B$ such that $d = \inf_{Y' \in B'} \|Y - Y'\| > 0$ and hence, given $\epsilon' = d\epsilon/(1 - \epsilon)$, there exists a point $Y_0 \in B'$ such that $d \le \|Y - Y_0\| < d + \epsilon'$. Therefore, if $X = (Y - Y_0)/\|Y - Y_0\|$ and $X' \in B'$, then $Y' = Y_0 + \|Y - Y_0\|X' \in B'$ and

$$\|X\| = 1,$$

$$\|X - X'\| = \|Y - Y'\|/\|Y - Y_0\| \ge d/(d + \epsilon') = 1 - \epsilon.$$

A set $S$ of linearly independent points of $B$ generates (or spans) a Banach subspace, to be denoted by $B(S)$, if all points of the subspace are linear combinations, or strong limits thereof, of the points of $S$; when $S$ is a finite set, $B(S)$ is said to be finite-dimensional.

b. If $X_1, X_2, \ldots$ are linearly independent, then there exist points $Y_n \in B(X_1, \ldots, X_n)$ such that $\|Y_n\| = 1$ and $\|X - Y_n\| \ge \frac{1}{2}$ for every $X \in B(X_1, \ldots, X_m)$ and $m < n$.

It suffices to apply the foregoing lemma with $\epsilon = \frac{1}{2}$ to the strictly increasing sequence of Banach spaces $B(X_1, \ldots, X_n) \subset B$, starting with

$$Y_1 = X_1/\|X_1\|.$$

c. Let $T$ be quasi-strongly compact. Then:

(i) No sequence of distinct proper values of $T$ may converge to a limit $\lambda$ with $|\lambda| \ge 1$, so that the number of distinct proper values of $T$ of modulus 1 is finite and they are isolated proper values.

(ii) The proper subspaces $B_\lambda$ of $T$ with $|\lambda| \ge 1$ are finite-dimensional.

Proof. Because of (Q) we can assume, without loss of generality, that $T^h = V + W$ where $V$ is strongly compact and $\|W\| < \frac{1}{4}$.

1° Suppose there is a sequence $\lambda_n \to \lambda$ of distinct proper values $\lambda_n$, and $|\lambda| \ge 1$. Then there is a sequence $X_n \ne \theta$ such that $TX_n = \lambda_nX_n$. If the $X_n$ are not linearly independent, then there exists a smallest integer $n$ such that $X_1, \ldots, X_n$ are linearly independent and $X_{n+1} = \sum_{k=1}^{n} c_kX_k$ with at least one $c_k \ne 0$. But then

$$\sum_{k=1}^{n} c_k(\lambda_{n+1} - \lambda_k)X_k = T\left(X_{n+1} - \sum_{k=1}^{n} c_kX_k\right) = 0,$$

so that $X_1, \ldots, X_n$ are linearly dependent and we reach a contradiction. Thus, the $X_n$ are linearly independent and, by b, there exist $Y_n = \sum_{k=1}^{n} c_kX_k$ such that $\|Y_n\| = 1$ and $\|X - Y_n\| \ge \frac{1}{2}$ for every $X \in B(X_1, \ldots, X_m)$ and $m < n$. Let $Z_n = Y_n/\lambda_n^h$ and observe that $Y_n - T^hZ_n = \sum_{k=1}^{n-1} d_kX_k$ so that $\|T^h(Z_m - Z_n)\| \ge \frac{1}{2}$. Since the sequence $Z_n$ is uniformly bounded, there exists a subsequence $Z'_n = Z_{k_n}$ such that $V(Z'_m - Z'_n) \to 0$ as $m, n \to \infty$ and hence

$$\|T^h(Z'_m - Z'_n)\| \le \|V(Z'_m - Z'_n)\| + \|W\|\left(|\lambda_{k_m}|^{-h} + |\lambda_{k_n}|^{-h}\right) \to 2\|W\|\,|\lambda|^{-h}.$$

It follows that $2\|W\| \ge \frac{1}{2}$. Since $\|W\| < \frac{1}{4}$, we reach a contradiction and (i) follows.

2° If the proper space $B_\lambda$ with $|\lambda| \ge 1$ is not finite-dimensional, then it contains a sequence $X_n$ of linearly independent points, the argument in 1° applies with $\lambda_n = \lambda$ for every $n$, and (ii) follows.

d. Let all $\|T'^n\| \le c' < \infty$ and let $T'$ be quasi-strongly compact. If $T'$ has no proper value of modulus 1, then there exist two constants $c_1$ and $c_2 \in (0, 1)$ such that for all $n$

$$\|T'^n\| \le c_1(1 - c_2)^n.$$

We give only an indication of the proof. To begin with, $T'$ has no proper values $\lambda$ with $|\lambda| > 1$. Otherwise, $T'X = \lambda X$ for an $X \ne \theta$, hence $\|T'^nX\| = |\lambda|^n \cdot \|X\| \to \infty$ as $n \to \infty$, and this contradicts $\|T'^nX\| \le c'\|X\|$, $c' < \infty$. Since $T'$ has no proper value of modulus 1, it follows, by the preceding lemma, that it has no proper value $\lambda$ with $|\lambda| \ge 1 - \gamma$, for some $\gamma > 0$. Now, it is easily seen that it suffices to consider the case $T' = V + W$ where $V$ is strongly compact and $\|W\| \le w < 1$. It can then be proved that $T' - \lambda I$ has an inverse for $|\lambda| \ge \max(1 - \gamma, w) = 1 - c_2$, so that the series $I + \sum T'^n_\lambda$ converges ($T'_\lambda = T'/\lambda$). Therefore there exists a $c_1$ such that

$$\|T'^n\| \le c_1(1 - c_2)^n.$$

B. Uniform ergodic theorem. Let all $\|T^n\| \le c$ and let $T$ be quasi-strongly compact. Then:

(i) The conclusions of the extended norms ergodic theorem hold.

(ii) $T$ can have only a finite number $m$ of proper values $\lambda_j$ of modulus 1 (and has no proper values of modulus greater than 1).

For every $\lambda$ of modulus 1 there exists a finite constant $c'$ such that for all $n$

$$\|\bar T_{\lambda,n} - \bar T_\lambda\| \le c'/n,$$

$\bar T_\lambda \ne 0$ if, and only if, some $\lambda_j = \lambda$, and every $\bar T_{\lambda_j}$ maps $B$ onto the finite-dimensional proper subspace $B_{\lambda_j}$ of $T$.

(iii) If $T' = T - \sum_{j=1}^{m} \lambda_j\bar T_{\lambda_j}$, then $T'$ is quasi-strongly compact and the $\|T'^n\|$ are uniformly bounded; in fact, there exist constants $c_1$ and $c_2 \in (0, 1)$ such that for all $n$

$$\|T'^n\| \le c_1(1 - c_2)^n.$$

Proof. Since a quasi-strongly compact $T$ is quasi-weakly compact, (i) follows by theorem A. Together with lemmas c and d, (i) implies (ii) and (iii) by elementary computations. For (iii), observe that

$$\|T'^n\| \le \|T^n\| + \sum_{j=1}^{m} \|\bar T_{\lambda_j}\| \le (m + 1)c = c',$$

while, if $T^h = V + W$ and $V' = V - \sum_{j=1}^{m} \lambda_j^h\bar T_{\lambda_j}$, then $\|T'^h - V'\| = \|W\|$. For (ii), observe that

$$\bar T_{\lambda,n} - \bar T_\lambda = \bar T'_{\lambda,n} + \sum_{j=1}^{m}\left(\frac{1}{n}\sum_{k=1}^{n}(\lambda_j/\lambda)^{k-1}\right)\bar T_{\lambda_j} - \bar T_\lambda,$$

where $\|T'^n\| \to 0$ faster than any fixed power of $1/n$, $\bar T_\lambda = 0$ or $\bar T_{\lambda_j}$ according as all $\lambda_j \ne \lambda$ or a $\lambda_j = \lambda$, and $\frac{1}{n}\sum_{k=1}^{n}(\lambda_j/\lambda)^{k-1} = O(1/n)$ or 1 according as $\lambda_j \ne \lambda$ or $\lambda_j = \lambda$. One may also (take $\lambda = 1$ to simplify the writing) observe that

$$\bar T_n - \bar T = (I - \bar T)\bar T_n, \qquad (T^n - I)/n = (T - I)(I - \bar T)\bar T_n,$$

so that

$$\|(T - I)(\bar T_n - \bar T)\| = \|(T^n - I)/n\| \le (c + 1)/n;$$

then examine boundedness of $(T - I)^{-1}$ on $B_1^c$.

35.3. Application to constant chains. Let $P^n(x, S)$ be an $n$-step transition pr.f. from a point $x \in R$ into a Borel set $S \subset R$. It is a Borel function in $x$ for every fixed $S$, a pr. in $S$ for every fixed $x$, and

$$P^{m+n}(x, S) = \int P^m(x, dy)P^n(y, S), \qquad m, n = 1, 2, \ldots.$$

It is easily checked that:

1. The space of all complex valued $\sigma$-additive functions $\varphi$ of bounded variation on the Borel field in $R$ is a Banach space $B$ when the norm $\|\varphi\|$ of $\varphi$ is defined by $\|\varphi\| = \operatorname{Var} \varphi$.

2. The transformation $T$ with kernel $P(x, S)$, defined by

$$\varphi \to T\varphi = \int \varphi(dx)P(x, S)$$

is a linear transformation on $B$ to $B$ of norm 1; its $n$th iterate $T^n$ is given by

$$\varphi \to T^n\varphi = \int \varphi(dx)P^n(x, S)$$

and has norm 1 and kernel $P^n(x, S)$.

We say that the kernel $P(x, S)$ and the chain are quasi-strongly compact if $T$ is quasi-strongly compact. The uniform ergodic theorem, where, obviously, we can replace $\bar T_n = \frac{1}{n}\sum_{k=1}^{n} T^{k-1}$ by $\frac{1}{n}\sum_{k=1}^{n} T^k$ without changing the conclusions, yields at once, on account of uniformity in all $\sigma$-additive functions of bounded variation, the following result. We set

$$\bar P^n_\lambda(x, S) = \frac{1}{n}\sum_{k=1}^{n} P^k(x, S)/\lambda^k, \qquad \bar P_j(x, S) = \bar P_{\lambda_j}(x, S),$$

and omit the subscript $\lambda$ if it equals 1.

Let the chain be quasi-strongly compact. Then:

(i) There exists only a finite number $m$ of proper values $\lambda_j$ of modulus 1 with corresponding finite-dimensional proper subspaces of the transformation.

(ii) For every $\lambda$ of modulus 1, there exists a constant $c'$ such that for all $n$

$$\sup_{x,S} |\bar P^n_\lambda(x, S) - \bar P_\lambda(x, S)| \le c'/n$$

and

$$\int P^n(x, dy)\bar P_\lambda(y, S) = \int \bar P_\lambda(x, dy)P^n(y, S) = \lambda^n\bar P_\lambda(x, S),$$

$$\int \bar P_\lambda(x, dy)\bar P_{\lambda'}(y, S) = \delta_{\lambda\lambda'}\bar P_\lambda(x, S).$$

(iii) $\bar P_\lambda(x, S) \ne 0$ if, and only if, some $\lambda_j = \lambda$, and

$$P^n(x, S) = \sum_{j=1}^{m} \lambda_j^n\bar P_j(x, S) + P'^n(x, S)$$

with

$$\int \bar P_j(x, dy)P'^n(y, S) = \int P'^n(x, dy)\bar P_j(y, S) = 0$$

and constants $c_1$ and $c_2 \in (0, 1)$ such that for all $n$

$$\sup_{x,S} |P'^n(x, S)| \le c_1(1 - c_2)^n.$$
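
For a finite chain the rate $c'/n$ in (ii) can be observed directly. The sketch below is an illustration under assumptions not in the text: a two-state chain that alternates deterministically, so that its proper values of modulus 1 are $1$ and $-1$, with hypothetical helper names. It forms the averages $\frac{1}{n}\sum_{k=1}^{n} P^k$ and measures their distance to the matrix with all entries $\frac{1}{2}$; for a two-point space the supremum over sets $S$ is attained on one-point sets, so comparing matrix entries suffices.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def cesaro_kernel(p, n):
    """Return (1/n) * (P + P^2 + ... + P^n)."""
    size = len(p)
    total = [[0.0] * size for _ in range(size)]
    power = [row[:] for row in p]  # P^1
    for _ in range(n):
        for i in range(size):
            for j in range(size):
                total[i][j] += power[i][j]
        power = mat_mul(power, p)
    return [[entry / n for entry in row] for row in total]


def sup_error(p, limit, n):
    """Largest entrywise distance between the n-th average and the limit."""
    avg = cesaro_kernel(p, n)
    return max(abs(avg[i][j] - limit[i][j])
               for i in range(len(p)) for j in range(len(p)))


# Assumed example: the chain jumps to the other state at every step.
P = [[0.0, 1.0],
     [1.0, 0.0]]
P_BAR = [[0.5, 0.5],
         [0.5, 0.5]]

if __name__ == "__main__":
    for n in (1, 10, 101, 1000):
        print(n, sup_error(P, P_BAR, n))
```

Here the error equals $1/(2n)$ for odd $n$ and vanishes for even $n$, so that $c' = \frac{1}{2}$ works in this example, although the powers $P^n$ themselves do not converge.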

Let us examine the case $\lambda = 1$. On account of (ii), $\lambda = 1$ is a proper value of the transformation, and $\bar P(x, S)$ is a transition pr.f. which, for every fixed $x$, is an invariant element of the Banach space, with

$$\sup_{x,S} |\bar P^n(x, S) - \bar P(x, S)| \le c'/n.$$

It suffices to observe that from the last relation it follows that $\bar P(x, S)$ is a Borel function in $x$ for every fixed $S$ and a pr. in $S$ for every fixed $x$. Thus, it is a nonvanishing solution ($\bar P(x, R) = 1$) of the proper value 1 equation

$$\varphi(S) = \int \varphi(dx)P(x, S).$$

Observe that $\varphi = \varphi^+ - \varphi^-$ whatever be $\varphi \in B$, let $\varphi' \in B$ and set $\varphi \wedge \varphi' = \varphi - (\varphi - \varphi')^+ = \varphi' - (\varphi' - \varphi)^+$. Nonnegative $\varphi$ and $\varphi'$ are orthogonal if $\varphi \wedge \varphi' = 0$ or, equivalently, if there exist two disjoint sets $S$, $S'$ such that $\varphi(S) = \varphi(\Omega)$ and $\varphi'(S') = \varphi'(\Omega)$. Clearly, $\varphi^+$ and $\varphi^-$ are orthogonal and so are $\varphi - \varphi \wedge \varphi'$ and $\varphi' - \varphi \wedge \varphi'$.

There exists a finite number $l$ of mutually orthogonal invariant pr.'s $\bar P_i$ such that a pr. $\bar P$ is invariant if, and only if,

$$\bar P = \sum_{i=1}^{l} p_i\bar P_i, \qquad p_i \ge 0, \qquad \sum_{i=1}^{l} p_i = 1.$$

Consider the family of all mutually orthogonal pr.'s $\bar P_i$ which are invariant, that is, are solutions of the proper value 1 equation. Since mutual orthogonality implies linear independence and the proper space $B_1$ is finite dimensional, the number of such solutions is finite. The “if” assertion follows and it suffices to prove that if a pr. $\bar P$ is invariant, then it can be written in the asserted form; to simplify the writing, we drop the bars over the $P$'s.

First, we show that $Q_i = P \wedge P_i = p_iP_i$. We exclude the trivial case $Q_i = 0$ and set $P'_i = Q_i/\|Q_i\|$. If the pr. $P'_i$ does not coincide with $P_i$, then $Q'_i = (P_i - P'_i)^+$ and $Q''_i = (P_i - P'_i)^-$ do not vanish. Since $Q'_i/\|Q'_i\|$, $Q''_i/\|Q''_i\|$ and the $P_j$ with $j \ne i$ form a family of $l + 1$ mutually orthogonal invariant pr.'s, we reach a contradiction. Hence, $Q_i = p_iP_i$ where, necessarily, $0 \le p_i \le 1$.

It remains to show that $Q = P - \sum_i p_iP_i = 0$. Either $p_i = 1$ and then $P \wedge P_i = P_i$, hence $P = P_i$; or $p_i < 1$ and then

$$(1 - p_i)\big((P - Q_i) \wedge P_i\big) \le (P - Q_i) \wedge (1 - p_i)P_i = (P - Q_i) \wedge (P_i - Q_i) = 0,$$

hence $(P - Q_i) \wedge P_i = 0$. It follows that every $Q \wedge P_i = 0$, so that, if $Q \ne 0$, then $Q/\|Q\|$ and the $P_i$ form a family of $l + 1$ mutually orthogonal invariant pr.'s. We reach a contradiction and the assertion follows.

There exists a finite measurable partition $\sum_{i=1}^{l} S_i = R$ such that

$$\bar P_iS_i = 1, \qquad \bar P(x, S) = \sum_{i=1}^{l} I_{S_i}(x)\bar P_iS, \qquad \bar P(x, S_i) = 1 \ \text{ for } x \in S_i;$$

and for all $n$

$$\sup_{x \in S_i,\,S} |\bar P^n(x, S) - \bar P_iS| \le c'/n.$$

These relations imply that, if the system starts at any point of $S_i$, then it remains a.s. in $S_i$ and the uniform limit is independent of the starting point.

The proof goes as follows: Since $\bar P(x, S)$ is an invariant pr. for every fixed $x$, we have

$$\bar P(x, S) = \sum_{i=1}^{l} p_i(x)\bar P_iS, \qquad p_i(x) \ge 0, \qquad \sum_{i=1}^{l} p_i(x) = 1.$$

Since the $\bar P_i$ are linearly independent and $\bar P(x, S)$ is invariant under $P(x, S)$ and under itself, it follows that

$$\int \bar P_i(dx)\,p_j(x) = \delta_{ij}.$$

By setting $S_i = [x;\ p_i(x) = 1]$ and $S'_n = \left[x;\ \frac{1}{n} \le 1 - p_i(x) < \frac{1}{n-1}\right]$, the last relation yields

$$1 = \int \bar P_i(dx)\,p_i(x) \le \bar P_iS_i + \sum_n \bar P_iS'_n = \bar P_iR.$$

But the equality holds only if every $\bar P_iS'_n = 0$. Hence $\bar P_iS_i = 1$, and the other assertions follow.

Let us mention a few classical cases in which $P(x, S)$ is quasi-strongly compact and invite the reader to check it.

1. Finite constant chains with transition pr. matrix $P_{jk}$, $j, k = 1, \ldots, l$.

2. Take $P(x, S) = \int_S p(x, y)\mu(dy)$, where $\mu$ is a pr. on Borel sets $S$ and $p(x, y)$ is a bounded nonnegative Borel function on $R \times R$; the second iterate is strongly compact.

3. Take $P^h(x, S) \le cP^h(y, S)$ for some integer $h$ and all $x$, $y$, $S$.

4. Take $P^h(x, S) \le c\mu S$, where $\mu S$ is a pr., for some integer $h$ and all $x$, $S$.

5. Take $P^h(x, S) \le 1 - \epsilon$ for $\mu S \le \epsilon$, where $\mu S$ is a pr. and $\epsilon > 0$, for some integer $h$ and all $x$, $S$. This is the Doblin condition and encompasses the preceding ones. Form

$$P^h(x, S) = \int_S p_0(x, y)\mu(dy) + P^h(x, N_xS)$$

where $p_0(x, y)$ can be selected to be nonnegative and Borel measurable on $R \times R$, and $N_x$ is the $\mu$-null set on which the $\mu$-singular part of $P^h(x, S)$ does not vanish. Then set $p(x, y) = \min(p_0(x, y), 1/\epsilon)$ and observe that

$$P^h(x, S) = \int_S p(x, y)\mu(dy) + Q(x, S)$$

with $0 \le Q(x, S) \le 1 - \epsilon$ for all $x$ and $S$. Thus $T^h = T' + Q$ where $T'$ has for kernel $\int_S p(x, y)\mu(dy)$ and $Q$ has for kernel $Q(x, S)$, hence $\|Q\| \le 1 - \epsilon$. In the expansion of $T^{hk} = (T' + Q)^k$, all terms containing $T'$ at least twice are strongly compact and there are $k + 1$ other terms of norm $\le (1 - \epsilon)^{k-1}$. Thus, for $k$ sufficiently large

$$\|T^{hk} - V_k\| \le (k + 1)(1 - \epsilon)^{k-1} < 1$$

and every $V_k$ is strongly compact. Thus, Doblin's condition implies quasi-strong compactness and the foregoing results apply.


COMPLEMENTS AND DETAILS 

In what follows $T$ operates on $\mathcal{A}$ to $\mathcal{A}$ and $t$ operates on $\Omega$ to $\Omega$.

1. Let $T$ be null-preserving. To every $P$-invariant event $A$ there corresponds an invariant event $A'$ such that $AA'^c + A^cA'$ is null. Example: $\limsup T^nA$.

2. Let $\Omega = [0, 1)$, $P$ = Lebesgue measure, $g$ integrable of period 1.

$$\frac{1}{n}\sum_{k=0}^{n-1} g(x + kc) \xrightarrow{a.s.} \int_0^1 g\,dx, \qquad c \text{ irrational fixed};$$

$$\frac{1}{n}\sum_{k=0}^{n-1} g(2^kx) \xrightarrow{a.s.} \int_0^1 g\,dx.$$

(In the first case introduce $t(x) = x + c$ modulo 1; in the second case introduce $t(x) = 2x$ modulo 1. Show that the transformations preserve the measure and are indecomposable.)
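
The first limit can be watched numerically. The sketch below is an illustration under assumed choices: $c = \sqrt{2}$, $g(u) = \cos 2\pi u$ (whose integral over a period is 0), and a hypothetical helper name. For this particular $g$ the averages are a geometric sum, which gives the explicit bound $\left|\frac{1}{n}\sum_{k=0}^{n-1} g(x + kc)\right| \le \frac{1}{n|\sin \pi c|}$ for every $x$.

```python
import math


def rotation_average(g, x, c, n):
    """Return (1/n) * sum_{k=0}^{n-1} g((x + k*c) mod 1)."""
    return sum(g((x + k * c) % 1.0) for k in range(n)) / n


def g(u):
    """Assumed test function of period 1 with integral 0 over [0, 1)."""
    return math.cos(2.0 * math.pi * u)


C = math.sqrt(2.0)  # assumed irrational rotation number
X0 = 0.1            # arbitrary starting point

if __name__ == "__main__":
    for n in (10, 1000, 100000):
        bound = 1.0 / (n * abs(math.sin(math.pi * C)))
        print(n, rotation_average(g, X0, C, n), bound)
```

The doubling map $t(x) = 2x$ modulo 1 is not simulated here, because binary floating-point numbers are dyadic rationals and their orbits reach 0 after finitely many doublings.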

3. Construct examples of non $P$-preserving $T$ such that for every $X \in L$, $T^nX \in L$. For instance, take suitable $T$ such that $T^2 = I$ on $\Omega = [0, 1)$ with $P$ = Lebesgue measure; in particular, take suitable linear transformations of $[0, c)$ and $[c, 1)$ into $[0, 1)$.

4. Let $Y_1, Y_2, \ldots$ be a stationary sequence of r.v.'s. There exists a stationary sequence $\ldots, X_{-1}, X_0, X_1, \ldots$ such that the laws of $(X_1, X_2, \ldots)$ and of $(Y_1, Y_2, \ldots)$ are the same. (Take, for every finite subfamily of $X$'s,

$$\mathcal{L}(X_{k_1}, \ldots, X_{k_m}) = \mathcal{L}(Y_{k_1+h}, \ldots, Y_{k_m+h})$$

where $h$ is so large that the subscripts of $Y$'s are positive, and apply the consistency theorem.) Since all pr. properties of the sequence $Y_1, Y_2, \ldots$ are the same as those of $X_1, X_2, \ldots$, we can replace in what follows every $X_k$ with $k > 0$ by $Y_k$.

If $E|X_1|^r < \infty$ for an $r \ge 1$, then

$$E(X_1 \mid X_0, \ldots, X_{-n+1}) \xrightarrow{a.s.,\ r} E(X_1 \mid X_0, X_{-1}, \ldots) = U.$$

(Observe that the left-hand side r.v.'s form a martingale sequence.)

If $E|X_1|^r < \infty$ for an $r \ge 1$, then stationarity of $\ldots, X_{-1}, X_0, X_1, \ldots$ entails

$$\mathcal{L}\{E(X_{n+1} \mid X_1, \ldots, X_n)\} \to \mathcal{L}(U),$$

$$U^{(n)} = \frac{1}{n}\sum_{k=1}^{n} E(X_{k+1} \mid X_1, \ldots, X_k) \xrightarrow{r} \bar U.$$

(By stationarity $\mathcal{L}\{E(X_{n+1} \mid X_1, \ldots, X_n)\} = \mathcal{L}\{E(X_1 \mid X_0, \ldots, X_{-n+1})\}$. For every $Z \in L_r$, set $\|Z\| = (E|Z|^r)^{1/r}$ and observe that

$$\|U^{(n)} - \bar U\| \le \frac{1}{n}\sum_{k=1}^{n} \|E(X_{k+1} \mid X_1, \ldots, X_k) - U_{k+1}\| + \left\|\frac{1}{n}\sum_{k=1}^{n}(U_{k+1} - \bar U)\right\|$$

where the $U_k$ are the translates of $U$ by $k$. By the stationarity assumption, the sequence $U_1, U_2, \ldots$ is stationary and, since $E|U|^r \le E|X_1|^r < \infty$, the second right-hand side term converges to 0. The first one reduces to

$$\frac{1}{n}\sum_{k=1}^{n} \|E(X_1 \mid X_0, \ldots, X_{-k+1}) - U\|,$$

every term of the sum converges to 0 as $k \to \infty$, and so does their arithmetic mean.)

5. Let $\Omega$ be a compact metric space, $\mathcal{A}$ the minimal $\sigma$-field over the class of open sets, $PA > 0$ for every open set $A$. Let $T = t^{-1}$ be $P$-preserving, indecomposable, and the $t^n$ be equicontinuous.

If $g$ on $\Omega$ is continuous and integrable, then $\bar g_n \to \int g$. The same holds if, for every $\epsilon > 0$, there exist continuous bounds $g' \le g \le g''$ with $\int |g'' - g'| < \epsilon$.

Application. If $g$ on $R$ of period 1 is Riemann integrable, then

$$\bar g_n(x) = \frac{1}{n}\sum_{k=0}^{n-1} g(x + kc) \to \int_0^1 g\,dx, \qquad c \text{ irrational fixed}.$$

(Replace, in 2, $\Omega$ by a circumference of length 1.)

6. $\limsup \bar P^n \le cP$ does not imply $P^n \le cP$ for every $n$. What are the consequences of this statement? A counter example: Let $\Omega = \sum A_n$ where $A_n$ consists of $2n + 1$ points. Denoting by $1, 2, \ldots, 2n + 1$ the points of a fixed $A_n$, set for every $A_n$ $\mu(k) = 2^{k-1}$ or $2^{2n-k}$ or 1 according as $1 \le k \le n$ or $n < k \le 2n$ or $2n < k \le 2n + 1$, and set $t(k) = 2n + 1$ or $k - 1$ according as $k = 1$ or $1 < k \le 2n + 1$. Finally, set $PA = \int$ …

7. Let $\Omega$ be the set of irrationals in $(0, 1)$ and let $P$ = Lebesgue measure. Let $t(x) = 1/x$ modulo 1. Show that with the usual notation for continued fraction

$$t(x) = t\left(\cfrac{1}{c_1 + \cfrac{1}{c_2 + \cfrac{1}{c_3 + \cdots}}}\right) = \cfrac{1}{c_2 + \cfrac{1}{c_3 + \cdots}}.$$

(a) The transformation $T = t^{-1}$ is $P$-indecomposable.

(Let $A$ be invariant with $p = PA < 1$. To prove that $p = 0$, take a fixed $x_0$, set $c_n(x_0) = c_n$, and denote by $a/b$ and $a'/b'$ the $(2n - 1)$th and $2n$th approximations of $x_0$, so that

$$y = \cfrac{1}{c_1 + \cdots + \cfrac{1}{c_{2n} + x}} = \frac{ax + a'}{bx + b'}.$$

Then $t^{2n}(y) = x$; hence $I_A(x) = I_A(y)$. Set $\alpha = \frac{a'}{b'}$ and $\beta = \frac{a + a'}{b + b'}$. Then

$$\frac{P(A \cap (\alpha, \beta))}{\beta - \alpha} = b'(b + b')\int_0^1 \frac{I_A(x)}{(bx + b')^2}\,dx.$$

As $x_0$ varies over $\Omega$ and $n$ over all the integers, the intervals $(\alpha, \beta)$ form a Vitali covering of $\Omega$ and, by Lebesgue's density theorem, $p = 0$.)

(b) Let $\mu A = \frac{1}{\log 2}\int_A \frac{dx}{1 + x}$. Then $T = t^{-1}$ is $\mu$-preserving. (This follows from $\mu(t^{-1}A) = \mu A$ for all $A = (0, x)$ by

$$\int_0^x \frac{dx'}{1 + x'} = \sum_{n=1}^{\infty} \int_{1/(n+x)}^{1/n} \frac{dx'}{1 + x'}.)$$

(c) The null sets and the integrable functions $g$ are the same for $\mu$ and $P$, the a.s.-ergodic theorem holds for $T = t^{-1}$ and both measures, and

$$\frac{1}{n}\sum_{k=0}^{n-1} g(t^kx) \xrightarrow{a.s.} \frac{1}{\log 2}\int_0^1 \frac{g(x)}{1 + x}\,dx.$$

Applications. 1° For almost all $x \in \Omega$ and every integer $p$, the frequency of $p$ in $\{c_n(x)\}$ is $\frac{1}{\log 2}\log\frac{(p + 1)^2}{p(p + 2)}$.

(Take $g = I_A$ where $A$ is the set of all $x$ such that $c_1(x) = p$.)

2° $\sqrt[n]{c_1(x)c_2(x) \cdots c_n(x)} \xrightarrow{a.s.} \prod_{k=1}^{\infty}\left(1 + \frac{1}{k(k + 2)}\right)^{\log k/\log 2}$.

(Take $g(x) = \log c_1(x)$.)

3° $\frac{1}{n}\sum_{k=1}^{n} c_k(x) \xrightarrow{a.s.} +\infty$.

(Take $g(x) = c_1(x)$.)
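
The invariance in (b) and the frequencies in application 1° reduce to sums of logarithms, which can be checked numerically. The sketch below uses hypothetical helper names and a truncated series, so it is only a sanity check: it sums $\mu$ over the intervals $(1/(n + x), 1/n)$, whose union is $t^{-1}(0, x)$, and compares the result with $\mu(0, x)$; it also evaluates $\mu\{c_1 = p\} = \mu(1/(p + 1), 1/p)$.

```python
import math


def gauss_measure(a, b):
    """mu(a, b) = (1 / log 2) * integral from a to b of dx / (1 + x)."""
    return (math.log1p(b) - math.log1p(a)) / math.log(2.0)


def preimage_measure(x, terms):
    """Partial sum of mu over (1/(n+x), 1/n), n = 1..terms.

    The full series is the measure of the inverse image of (0, x) under
    t(x) = 1/x modulo 1; the truncation error is of order x / terms.
    """
    return sum(gauss_measure(1.0 / (n + x), 1.0 / n)
               for n in range(1, terms + 1))


def digit_frequency(p):
    """mu of the set where the first partial quotient c_1 equals p."""
    return gauss_measure(1.0 / (p + 1), 1.0 / p)


if __name__ == "__main__":
    x = 0.3
    print(preimage_measure(x, 100000), gauss_measure(0.0, x))
    for p in (1, 2, 3):
        print(p, digit_frequency(p))
```

The frequencies agree with $\frac{1}{\log 2}\log\frac{(p + 1)^2}{p(p + 2)}$ and their partial sums telescope to a quantity tending to 1.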
We consider sequences, and more generally random functions, formed by r.v.'s whose second moments and hence mixed second moments are finite.

Their second order properties are those which can be expressed in terms of these moments. Up to equivalences, the r.v.'s in question can be interpreted as points in a Hilbert space, and such spaces are a “natural” generalization of euclidean spaces for which all the classical tools were developed. Thus, it is to be expected that the study of second order properties, to which this chapter is devoted, will only require analytical tools similar to the familiar ones. In fact, except for a few concepts and properties, this chapter is practically independent of the preceding ones.

§ 36. ORTHOGONALITY 

We examine in this section the elementary properties of orthogonal r.v.'s. The concepts become almost obvious if geometric intuition is used; we expect the reader to do it. The only difficulties consist in the justification of the intuitive conclusions in the nonfinite case, and in the use of pr. concepts such as a.s. convergence which have no direct geometric equivalent.

Let $(\Omega, \mathcal{A}, P)$ be a fixed pr. space. Let $X, Y, \ldots$, with or without affixes, be second order r.v.'s, in general complex-valued:

$$E|X|^2 < \infty, \qquad E|Y|^2 < \infty, \qquad \ldots,$$

so that by Schwarz's inequality their mixed second moments $EX\bar Y$ exist and are finite. The “bar” means “complex-conjugate.”

The space $L_2$ of equivalence classes of such r.v.'s is a Hilbert space: equivalent r.v.'s represent the same point, and $EX\bar Y$ defines the scalar product of the points represented by $X$ and $Y$. Scalar products determine the norms $\|X\| = (E|X|^2)^{1/2} = (EX\bar X)^{1/2}$, and norms determine distances $\|X - Y\|$. Convergence “in norm” is convergence “in quadratic mean”: $X_n \xrightarrow{q.m.} X$ means $\|X_n - X\| \to 0$, that is, $E|X_n - X|^2 \to 0$. The space $L_2$ is complete in the sense that $X_n \xrightarrow{q.m.} X \iff X_m - X_n \xrightarrow{q.m.} 0$, $m, n \to \infty$.

36.1. Orthogonal r.v.'s; convergence and stability. $X$ and $Y$ are orthogonal, and we write $X \perp Y$, if $EX\bar Y = 0$. In particular, $X \perp X$ if, and only if, $E|X|^2 = 0$, that is, $X = 0$ a.s.; in fact, $X = 0$ a.s. is orthogonal to every $Y$. Since independent r.v.'s in $L_2$ are orthogonal when centered at expectations, we assume, for sake of analogy, that all our r.v.'s are centered at expectations, unless otherwise stated. Since our r.v.'s have finite second hence first moments, they can always be so centered and the assumption made does not restrict the generality. Then $E|X|^2$ is the variance of $X$ and $EX\bar Y$ is the covariance of $X$ and $Y$.

From $X \perp Y$ it follows, upon expanding, that $E|X + Y|^2 = E|X|^2 + E|Y|^2$. More generally, if $X_1, X_2, \ldots$ are orthogonal r.v.'s, then (Pythagorean relation)

$$E\left|\sum_{k=1}^{n} X_k\right|^2 = \sum_{k=1}^{n} E|X_k|^2, \qquad n = 1, 2, \ldots,$$

and, as $n \to \infty$,

$$E\left|\sum_{k=1}^{n} X_k\right|^2 \uparrow \sum_{k=1}^{\infty} E|X_k|^2.$$

Since, by the mutual convergence in q.m. criterion, the sequence of sums $\sum_{k=1}^{n} X_k$ converges in q.m. if, and only if, $E\left|\sum_{k=m}^{n} X_k\right|^2 = \sum_{k=m}^{n} E|X_k|^2 \to 0$ as $m, n \to \infty$, we have

A. Convergence and stability in q.m. Let the r.v.'s $X_n$ be orthogonal.

(i) The series $\sum X_n$ converges in q.m. if, and only if, $\sum E|X_n|^2 < \infty$; and then $E\left|\sum X_n\right|^2 = \sum E|X_n|^2$ (Pythagorean relation).

(ii) If $\sum \frac{E|X_n|^2}{b_n^2} < \infty$, $b_n \uparrow \infty$, then $\frac{1}{b_n}\sum_{k=1}^{n} X_k \xrightarrow{q.m.} 0$.

The second assertion follows from the first by Kronecker's lemma.
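
On a finite pr. space the Pythagorean relation is a finite computation. The sketch below is an illustration under assumptions not in the text: the space consists of the eight equally likely sign patterns of three coin tosses, the r.v.'s are real-valued (so no conjugation is needed), and the helper names are hypothetical.

```python
from itertools import product

# Assumed pr. space: 8 equally likely points, the sign patterns of 3 tosses.
OMEGA = list(product((-1, 1), repeat=3))


def rv(f):
    """Tabulate a real r.v. as the list of its values on OMEGA."""
    return [f(w) for w in OMEGA]


def expectation(values):
    """Expectation under the uniform pr. on OMEGA."""
    return sum(values) / len(values)


def inner(x, y):
    """Scalar product E(XY) of two real r.v.'s."""
    return expectation([a * b for a, b in zip(x, y)])


# Three centered, mutually orthogonal r.v.'s.
X1 = rv(lambda w: 2.0 * w[0])
X2 = rv(lambda w: -1.0 * w[1])
X3 = rv(lambda w: 0.5 * w[0] * w[2])

TOTAL = [a + b + c for a, b, c in zip(X1, X2, X3)]

if __name__ == "__main__":
    lhs = inner(TOTAL, TOTAL)                           # E|X1 + X2 + X3|^2
    rhs = inner(X1, X1) + inner(X2, X2) + inner(X3, X3) # sum of E|Xk|^2
    print(lhs, rhs)
```

Both sides equal $4 + 1 + 0.25 = 5.25$ here; the cross terms vanish because every pair has scalar product 0.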

The foregoing properties correspond to those of the case of independence if independence and convergence a.s. are replaced by orthogonality and convergence in q.m., respectively. In fact, not much more is needed in the case of orthogonality to obtain an inequality (Rademacher) similar to that of Kolmogorov and, consequently, a.s. convergence and stability theorems well-known in the theory of orthogonal functions.

a. If $S_n = \sum_{k=1}^{n} X_k$ are consecutive sums of orthogonal r.v.'s, then

$$E\left(\max_{h \le n} |S_h|\right)^2 \le (\log 4n/\log 2)^2 \sum_{k=1}^{n} E|X_k|^2.$$

Proof. For $n = 1$ the inequality is trivial. For $n > 1$, let $m$ be the integer such that $2^{m-1} < n \le 2^m$, set $X'_k = X_k$ or 0 according as $k \le n$ or $n < k \le 2^m$, and assign $X'_k$ to the point of abscissa $k$. Divide the interval $(0, 2^m]$ into intervals $(0, 2^{m-1}]$ and $(2^{m-1}, 2^m]$, each of these two intervals into two halves, and so on; the elements of the $(m - j)$th partition are of length $2^j$ and $j = 0, 1, \ldots, m$. Every interval $(0, h]$ is the sum of at most $m$ disjoint intervals each of which belongs to a different partition; in other words, we have the dyadic representation of $h$ in geometric terms. We can write $S_h = \sum_{j=0}^{m} Y_{jh}$ where any $Y_{jh}$ is sum of the r.v.'s belonging to the interval of length $2^j$ which may or may not figure in the representation of $h$, so that some $Y_{jh}$ may vanish. It follows, by the elementary Schwarz inequality

$$\left|\sum_{j=0}^{m} a_j\right|^2 \le (m + 1)\sum_{j=0}^{m} |a_j|^2,$$

that, whatever be $h \le n$,

$$|S_h|^2 \le (m + 1)\sum_{j=0}^{m} |Y_{jh}|^2 \le (m + 1)T$$

where $T$ is the sum of all r.v.'s $|Y_{jh}|^2$ as $j$ and $h$ vary. But the expectation of the sum of all those r.v.'s $|Y_{jh}|^2$ which belong to the $j$th partition is $\sum_{k=1}^{n} E|X_k|^2$, so that $ET \le (m + 1)\sum_{k=1}^{n} E|X_k|^2$. Therefore

$$E\left(\max_{h \le n} |S_h|\right)^2 \le (m + 1)^2\sum_{k=1}^{n} E|X_k|^2 \le (\log 4n/\log 2)^2\sum_{k=1}^{n} E|X_k|^2.$$
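
The dyadic representation used in this proof can be made explicit. The sketch below, with a hypothetical function name, splits $(0, h]$ greedily into dyadic intervals of distinct lengths $2^j$, each of which is an element of the partition of $(0, 2^m]$ into intervals of that length; pairs `(a, b)` stand for the intervals $(a, b]$.

```python
def dyadic_blocks(h, m):
    """Split (0, h], with 1 <= h <= 2**m, into disjoint dyadic intervals.

    Each returned pair (a, b) stands for (a, b]; the lengths b - a are
    distinct powers of 2 and a is a multiple of b - a, so every block is
    an element of one of the successive halvings of (0, 2**m].
    """
    if not 1 <= h <= 2 ** m:
        raise ValueError("h must satisfy 1 <= h <= 2**m")
    blocks = []
    start = 0
    for j in range(m, -1, -1):  # longest blocks first
        length = 2 ** j
        if start + length <= h:
            blocks.append((start, start + length))
            start += length
    return blocks


if __name__ == "__main__":
    print(dyadic_blocks(11, 4))  # [(0, 8), (8, 10), (10, 11)]
```

The lengths 8, 2, 1 in the example are the binary digits of $11 = 8 + 2 + 1$, so that $S_{11}$ is the sum of three block sums, one from each of three different partitions.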
b. If $S_n = \sum_{k=1}^{n} X_k$ are consecutive sums of orthogonal r.v.'s and $\sum b_nE|X_n|^2 < \infty$, $b_n \uparrow \infty$, then $S_n \xrightarrow{q.m.} S$ and there exists a subsequence $S_{n_k} \xrightarrow{a.s.} S$ such that, for every integer $k$, $b_{n_k}$ be the first $b_n \ge k$.

Proof. The hypothesis implies that $\sum E|X_n|^2 < \infty$ so that A applies and $S_n \xrightarrow{q.m.} S$.

Let

$$r_n = E|S - S_n|^2 = \sum_{j=n+1}^{\infty} E|X_j|^2$$

so that

$$\sum_k E|S - S_{n_k}|^2 = \sum_k r_{n_k} = \sum_k k(r_{n_k} - r_{n_{k+1}}) + \lim_{k \to \infty} kr_{n_{k+1}} \le \sum b_nE|X_n|^2 < \infty$$

and, hence, $S_{n_k} \xrightarrow{a.s.} S$ on account of the Borel-Cantelli lemma and the Tchebichev inequality.

B. A.s. convergence and stability. Let the r.v.'s $X_n$ be orthogonal.

(i) If $\sum \log^2 n\,E|X_n|^2 < \infty$, then the series $\sum X_n$ converges in q.m. and a.s.

(ii) If $\sum \left(\frac{\log n}{b_n}\right)^2 E|X_n|^2 < \infty$, $b_n \uparrow \infty$, then $\frac{1}{b_n}\sum_{k=1}^{n} X_k \xrightarrow{a.s.} 0$.

Proof. Let $S_n = \sum_{k=1}^{n} X_k$. Under hypothesis (i), $S_n \xrightarrow{q.m.} S$ according to A, and lemma b with $b_n = \log n/\log 2$ yields $S_{2^k} \xrightarrow{a.s.} S$; thus (i) will follow if we prove that $T_k = \max_{2^k \le n < 2^{k+1}} |S_n - S_{2^k}| \xrightarrow{a.s.} 0$. But lemma a, with $n = 2^{k+1} - 2^k = 2^k$, yields, by elementary computations,

$$\sum_k E|T_k|^2 \le (3/\log 2)^2 \sum_n \log^2 n\,E|X_n|^2 < \infty$$

and the assertion follows by the Borel-Cantelli lemma and the Tchebichev inequality. This proves (i), and (ii) follows by Kronecker's lemma.

Corollary. If the series $\sum X_n$ of orthogonal r.v.'s converges in q.m., then $\sum_{k=1}^{n} X_k = o(\log n)$ a.s.

This follows by A(i) from B(ii) with $b_n = \log n$.

36.2. Elementary orthogonal decomposition. Let $\{X_j\}$ be a countable family of (second order) r.v.'s. We intend to show that the $X_j$'s are linear combinations of orthogonal r.v.'s.

Without restricting the generality, we can exclude those $X_j$'s which are degenerate at 0 or are linear combinations of other r.v.'s of the family. Let $S$ be an arbitrary finite subset of indices $j$ and let $c_j$ be complex numbers. If $\sum_{j \in S} c_jX_j = 0$ a.s., then

$$E\left(\sum_{j \in S} c_jX_j\bar X_h\right) = \sum_{j \in S} c_jEX_j\bar X_h = 0, \qquad h \in S.$$

Conversely, this set of relations implies readily that $E\left|\sum_{j \in S} c_jX_j\right|^2 = 0$ and hence $\sum_{j \in S} c_jX_j = 0$ a.s. On the other hand, if there exist some nonvanishing $c_j$ such that the foregoing set of relations holds, then the determinant $\|EX_j\bar X_h\| = 0$, $j, h \in S$. Thus, what we have excluded is the possibility for such determinants to vanish. It follows, by elementary computations, that the r.v.'s $Y_1 = X_1$ and, for $n > 1$,

$$Y_n = \begin{vmatrix} X_1 & X_2 & \cdots & X_n \\ EX_1\bar X_1 & EX_2\bar X_1 & \cdots & EX_n\bar X_1 \\ \vdots & \vdots & & \vdots \\ EX_1\bar X_{n-1} & EX_2\bar X_{n-1} & \cdots & EX_n\bar X_{n-1} \end{vmatrix}$$

are nondegenerate. Furthermore, they are linear combinations of the $X_n$ and are easily verified to be mutually orthogonal. Upon setting $\xi_j = Y_j/(E|Y_j|^2)^{1/2}$, we have $E\xi_j\bar\xi_h = \delta_{jh}$ (= 1 or 0 according as $j = h$ or $j \ne h$) and

$$X_j = \sum_{h=1}^{j} c_{jh}\xi_h, \qquad c_{jh} = EX_j\bar\xi_h.$$

In general, r.v.'s $\xi_j$ such that $E\xi_j\bar{\xi}_h = \delta_{jh}$ are said to be orthonormal.
Given a finite or denumerable set of orthonormal r.v.'s $\xi_1, \xi_2, \cdots$, we
can set

$$X = X'_n + \sum_{k=1}^{n} c_k \xi_k, \quad c_k = E X \bar{\xi}_k.$$

Then $E X'_n \bar{\xi}_k = 0$ for $k \le n$ and, by the Pythagorean relation,

$$E|X|^2 = E|X'_n|^2 + \sum_{k=1}^{n} E|c_k \xi_k|^2 \ge \sum_{k=1}^{n} |c_k|^2.$$

It follows, by letting $n \to \infty$, that, given an orthonormal sequence $\xi_n$,

$$\sum |c_n|^2 \le E|X|^2 < \infty$$

so that, by A, the series $\sum c_n \xi_n$ converges in q.m. and hence $X'_n \xrightarrow{\text{q.m.}} X'$
orthogonal to every $\xi_n$. Thus, we obtain the orthogonal decomposition

$$X = X' + \sum c_n \xi_n \ \text{a.s.}, \quad c_n = E X \bar{\xi}_n,$$

with

$$E|X|^2 = E|X'|^2 + \sum |c_n|^2.$$

The r.v. $X'$, being orthogonal to all $\xi_n$, is orthogonal to all linear
combinations of the $\xi_n$ and, hence, to their limits in q.m. This leads to the
introduction of linear subspaces; we recall that here all r.v.'s under
consideration are of second order.
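
The decomposition above can be sketched numerically. The following is a minimal illustration, not part of the original text, assuming a finite sample space with equally likely outcomes so that $E X \bar{Y}$ is a plain average; the helper names `inner`, `orthonormalize` and `decompose` are hypothetical. It orthonormalizes a family by the Gram-Schmidt procedure (a standard substitute for the determinant formula), discards linearly dependent r.v.'s, and checks the Pythagorean relation.

```python
# Random variables on a finite sample space with equally likely outcomes,
# represented as lists of complex values (one value per outcome).

def inner(x, y):
    """E[X * conj(Y)] under the uniform probability on the outcomes."""
    return sum(a * b.conjugate() for a, b in zip(x, y)) / len(x)

def orthonormalize(family, tol=1e-12):
    """Gram-Schmidt: return orthonormal xi_1, xi_2, ... spanning the family."""
    basis = []
    for x in family:
        y = list(x)
        for xi in basis:
            c = inner(x, xi)                      # c = E[X * conj(xi)]
            y = [a - c * b for a, b in zip(y, xi)]
        norm2 = inner(y, y).real
        if norm2 > tol:                           # skip linearly dependent r.v.'s
            basis.append([a / norm2 ** 0.5 for a in y])
    return basis

def decompose(x, basis):
    """Return (X', coefficients c_n) with X = X' + sum c_n xi_n, X' orthogonal to all xi_n."""
    coeffs = [inner(x, xi) for xi in basis]
    perp = list(x)
    for c, xi in zip(coeffs, basis):
        perp = [a - c * b for a, b in zip(perp, xi)]
    return perp, coeffs

# The third r.v. is the sum of the first two, so it is excluded from the base.
family = [[1, 1, 1, 1], [1, 2, 3, 4], [2, 3, 4, 5]]
family = [[complex(v) for v in x] for x in family]
basis = orthonormalize(family)

x = [complex(v) for v in (1, 0, 0, 2)]
perp, coeffs = decompose(x, basis)
energy_x = inner(x, x).real                       # E|X|^2
energy_split = inner(perp, perp).real + sum(abs(c) ** 2 for c in coeffs)
```

Here `energy_x` and `energy_split` agree up to rounding, which is the relation $E|X|^2 = E|X'|^2 + \sum |c_n|^2$ in this finite setting.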

A linear subspace $\mathcal{L}$ is a family of r.v.'s closed under formation of all
a.s. linear combinations of its elements. If, also, $\mathcal{L}$ is closed under
passages to the limit in q.m., then it is a closed linear subspace. Given
a family $\{X_t\}$ of r.v.'s, the linear space $\mathcal{L}_0\{X_t\}$ of all their linear
combinations and the closed linear subspace $\mathcal{L}\{X_t\}$, the closure of $\mathcal{L}_0\{X_t\}$
under passages to the limit in q.m., are said to be generated by the r.v.'s
$X_t$. If we keep only those $X_t$ which are linearly independent, we obtain
a base of $\mathcal{L}_0\{X_t\}$ and of $\mathcal{L}\{X_t\}$.

A r.v. $X$ is orthogonal to $\mathcal{L}$, and we write $X \perp \mathcal{L}$, if $X \perp Y$ whatever
be $Y \in \mathcal{L}$. For example, in what precedes, an arbitrary r.v. $X$ was
decomposed into $X' \perp \mathcal{L}\{\xi_n\}$ and $\sum c_n \xi_n \in \mathcal{L}\{\xi_n\}$. This orthogonal
decomposition with respect to $\mathcal{L}\{\xi_n\}$ can be generalized for arbitrary
closed linear subspaces, as follows:


a. Let $\mathcal{L} \subset L_2$ be a closed linear subspace. To every $X \in L_2$ there
corresponds an $X_0 \in \mathcal{L}$ such that $E|X - X_0|^2 = \inf_{Y \in \mathcal{L}} E|X - Y|^2 = a$,
and then $X - X_0 \perp \mathcal{L}$.

Proof. Let $\{X_n\} \subset \mathcal{L}$ be such that $E|X - X_n|^2 \to a$. Since

$$E|X_m - X_n|^2 = 2E|X_m - X|^2 + 2E|X - X_n|^2 - 4E\left|\frac{X_m + X_n}{2} - X\right|^2$$

and $(X_m + X_n)/2 \in \mathcal{L}$, it follows by letting $m, n \to \infty$ that

$$0 \le E|X_m - X_n|^2 \le 2E|X_m - X|^2 + 2E|X - X_n|^2 - 4a \to 4a - 4a = 0.$$

Therefore, by the mutual convergence criterion and the closure of $\mathcal{L}$
under passages to the limit in q.m., $X_n \xrightarrow{\text{q.m.}} X_0 \in \mathcal{L}$ and

$$E|X - X_0|^2 = \lim E|X - X_n|^2 = a.$$

Since $X_0 + cY \in \mathcal{L}$ whatever be $Y \in \mathcal{L}$ and the complex number $c$, we
have $E|X - (X_0 + cY)|^2 \ge a$. Thus, by taking $c = bE(X - X_0)\bar{Y}$
with $0 < b < 2/E|Y|^2$, we have

$$0 \le E|X - (X_0 + cY)|^2 - E|X - X_0|^2 = b(bE|Y|^2 - 2)\,|E(X - X_0)\bar{Y}|^2 \le 0.$$

Hence, $E(X - X_0)\bar{Y} = 0$ and the proof is terminated.

A. Projection theorem. Let $\mathcal{L}$ be a closed linear subspace. For
every $X$ there exists an a.s. unique orthogonal decomposition

$$X = X' + X'', \quad X' \perp \mathcal{L}, \quad X'' \in \mathcal{L}.$$

Proof. There exists such a decomposition: according to a it suffices to
set $X'' = X_0$. The decomposition is a.s. unique, since if $X = X'_1 + X''_1$
is another such decomposition, then $X' - X'_1 = X''_1 - X''$ and
$X' - X'_1 \perp X''_1 - X''$, so that $X' = X'_1$ a.s. and $X'' = X''_1$ a.s.

Corollary. If $\{\xi_t\}$, $E\xi_t\bar{\xi}_{t'} = \delta_{tt'}$, is an orthonormal base of the closed
linear subspace $\mathcal{L}$, then, for every $X$, there exists an a.s. unique orthogonal
decomposition

$$X = X' + \sum_j c_{t_j}\xi_{t_j}, \quad X' \perp \mathcal{L}.$$

Proof. It suffices to prove that if $X'' \in \mathcal{L}$, then there exists a countable
subset of indices $t_j$ such that $X'' = \sum_j c_{t_j}\xi_{t_j}$. Set $c_t = E X'' \bar{\xi}_t$ so
that, whatever be the finite summation set of $t$'s,

$$0 \le E\Big|X'' - \sum c_t \xi_t\Big|^2 = E|X''|^2 - \sum |c_t|^2,$$

and hence

$$\sum |c_t|^2 \le E|X''|^2 < \infty.$$

Thus, there can be only a countable number of nonvanishing $c_t$, and
the assertion is proved.

36.3. Projection, conditioning, and normality. If $X = X' + X''$
where $X' \perp \mathcal{L}$ and $X'' \in \mathcal{L}$, we say that $X'$ is the perpendicular from
$X$ to $\mathcal{L}$ and $X''$ is the projection of $X$ on $\mathcal{L}$. According to the projection
theorem, the perpendicular $X'$ and the projection $X''$ exist and are
a.s. unique whatever be $X$ and the closed linear subspace $\mathcal{L}$. We denote
the projection by $E_2(X \mid \mathcal{L})$ and if $\mathcal{L} = \mathcal{L}\{X_t\}$ we also write it
$E_2(X \mid \{X_t\})$. The reasons for this notation, similar to that of conditional
expectations, are many. The operation of projection plays the
role of a "second order" conditioning, for, as is readily verified, it is an
a.s. linear operation and has the smoothing property of the operation of
conditioning if $\sigma$-fields are interpreted as closed linear subspaces.

Furthermore, if $\mathcal{L}(\mathcal{B})$ is the space of all $\mathcal{B}$-measurable square
integrable r.v.'s then $E(X \mid \mathcal{B}) = E_2(X \mid \mathcal{L}(\mathcal{B}))$ a.s., since, for $B \in \mathcal{B}$,
$E\{E(X \mid \mathcal{B}) I_B\} = E X I_B = E\{E_2(X \mid \mathcal{L}(\mathcal{B})) I_B\}$. Also, this similarity is
related to normality. To begin with, observe that if $X = Y + iZ$,
where $Y = \Re X$ and $Z = \Im X$ with same affixes as $X$ if any, then

$$E X \bar{X}' = E Y Y' + E Z Z' - i(E Y Z' - E Z Y')$$

$$E X X' = E Y Y' - E Z Z' + i(E Y Z' + E Z Y').$$

It follows that $Y \perp X'$ and $Z \perp X'$ if, and only if, $E X \bar{X}' = 0$ and
$E X X' = 0$ (if $X$ or $X'$ is real-valued, then $E X \bar{X}' = 0$ implies that
$E X X' = 0$).

Let the r.v.'s $X$, $X'$ be jointly normal, that is, let the r.v.'s $Y$, $Z$, $Y'$, $Z'$
be jointly normal. If $X$ and $X'$ are orthogonal and real-valued, hence
$E X \bar{X}' = E X X' = 0$, then they are independent, since

$$\log E e^{i(uX + u'X')} = -\frac{u^2}{2} E X^2 - \frac{u'^2}{2} E X'^2 = \log E e^{iuX} + \log E e^{iu'X'}.$$

Similarly, if $X$ and $X'$ are orthogonal and complex-valued and
$E X X' = 0$, then they are independent, that is, the pairs $(Y, Z)$ and
$(Y', Z')$ are independent.

We shall say that a family of r.v.'s $X_t = Y_t + iZ_t$, where $t$ with or
without affixes varies on some set $T$, is strongly normal, if it is normal
(that is, all finite subfamilies of the r.v.'s $Y_t$, $Z_t$ are normal) and if
either all the $X_t$ are real-valued or all the $E X_t X_{t'} = 0$.

A. Within a strongly normal family, orthogonality is equivalent to
independence and projection is equivalent to conditioning.

Proof. The first assertion follows from the foregoing discussion.
As for the second assertion, let $X$, $X_t$, $t \in T$, form a strongly normal
family. Since $X' = X - E_2(X \mid \{X_t\}) = X - \sum_j c_{t_j} X_{t_j}$ is orthogonal
to and strongly jointly normal with every $X_t$, it follows, by the first
assertion, that $X'$ is independent of every $X_t$, hence $E(X' \mid \{X_t\}) = E X' = 0$ a.s.
Therefore

$$E(X \mid \{X_t\}) = E\Big(\sum_j c_{t_j} X_{t_j} \,\Big|\, \{X_t\}\Big) = \sum_j c_{t_j} X_{t_j} = E_2(X \mid \{X_t\}) \ \text{a.s.},$$

and the proof is concluded.

We shall see in the next section that, to any family of second order
r.v.'s, we can make correspond a strongly normal family with same
second order moments. Thus, a projection can always be considered
as a conditioning within suitably selected normal families. Furthermore,
to every concept in terms of conditional expectations corresponds
a second order concept in terms of projections which, according
to what precedes, coincides with the specialization within suitable
normality. Let us give two examples.

The concept of a martingale $\{X_n\}$ becomes

$$E_2(X_n \mid X_1, \cdots, X_{n-1}) = X_{n-1} \ \text{a.s.}$$

or, setting $Y_n = X_n - X_{n-1}$,

$$E_2(Y_n \mid Y_1, \cdots, Y_{n-1}) = 0 \ \text{a.s.},$$

that is, $Y_n \perp Y_k$ for $k < n$. Thus, a "second order martingale" is
simply a sequence of consecutive sums of orthogonal r.v.'s and the
results of 33.1 apply. As is to be expected, the a.s. properties of
martingales become properties in q.m. of second order martingales. For
example,

a. If the sequence $X_n$ is a second order martingale, then $E|X_n|^2 \uparrow$;
if, moreover, $\lim E|X_n|^2 < \infty$, then $X_n \xrightarrow{\text{q.m.}} X$ and $X$ closes this
martingale.

Proof. It suffices to write $X_n = \sum_{k=1}^{n} Y_k$ where the $Y_k$ are orthogonal
and hence $E|X_n|^2 = \sum_{k=1}^{n} E|Y_k|^2 \uparrow$. If, moreover,
$\lim E|X_n|^2 = \sum_{k=1}^{\infty} E|Y_k|^2 < \infty$, then by A, $X_n \xrightarrow{\text{q.m.}} X$. Furthermore, $X_n - X_m \perp X_k$
for $k \le m \le n$ and letting $n \to \infty$, it follows that $X - X_m \perp X_k$ for
$k \le m$, hence $E_2(X \mid X_1, \cdots, X_m) = X_m$ a.s. This proves the last
assertion.

The concept of a chain $\{X_n\}$ yields

$$E(X_n \mid X_1, \cdots, X_m) = E(X_n \mid X_m) \ \text{a.s.}$$

for $m < n$, $n = 2, 3, \cdots$. A "second order chain" is defined by
replacing $E$ by $E_2$. (In the strongly normal case, both concepts coincide.)

Set $r_{mn} = E(X_n \bar{X}_m)/E|X_m|^2$ or $0$ according as $E|X_m|^2 > 0$ or $= 0$;
then $E_2(X_n \mid X_m) = r_{mn} X_m$ a.s.


b. A sequence $\{X_n\}$ is a second order chain if, and only if,

$$r_{mp} = r_{mn} r_{np}, \quad m < n < p.$$

Proof. The relation is trivially true if $E|X_m|^2 = 0$. Otherwise, if

$$E_2(X_p \mid X_1, \cdots, X_n) = E_2(X_p \mid X_n) = r_{np} X_n \ \text{a.s.},$$

then $X_p - r_{np} X_n \perp X_m$ for $m \le n$, hence

$$r_{mp} E|X_m|^2 = E(X_p \bar{X}_m) = r_{np} E(X_n \bar{X}_m) = r_{mn} r_{np} E|X_m|^2$$

and the asserted relation holds. Conversely, if this relation is true,
then, for every $m < n$,

$$r_{np} E(X_n \bar{X}_m) = E(X_p \bar{X}_m);$$

hence $X_p - r_{np} X_n \perp X_m$, that is,

$$r_{np} X_n = E_2(X_p \mid X_n) = E_2(X_p \mid X_1, \cdots, X_n) \ \text{a.s.}$$

The proposition is proved.
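
As an illustration of proposition b, the relation $r_{mp} = r_{mn} r_{np}$ can be checked directly from a covariance. The sketch below is not from the original text; the three real-valued covariances are assumed examples (a geometric covariance $\rho^{|n-m|}$, the covariance $\min(n, m)$ of consecutive sums of orthonormal r.v.'s, and a moving-average covariance), and the function names are hypothetical.

```python
def r(cov, m, n):
    """r_mn = E(X_n conj(X_m)) / E|X_m|^2, or 0 when E|X_m|^2 = 0 (real case)."""
    var_m = cov(m, m)
    return cov(n, m) / var_m if var_m > 0 else 0.0

def is_second_order_chain(cov, size, tol=1e-12):
    """Check r_mp = r_mn * r_np for all m < n < p in 1..size."""
    return all(
        abs(r(cov, m, p) - r(cov, m, n) * r(cov, n, p)) <= tol
        for m in range(1, size + 1)
        for n in range(m + 1, size + 1)
        for p in range(n + 1, size + 1)
    )

# Assumed example covariances E(X_n X_m), all real-valued.
geometric = lambda n, m: 0.8 ** abs(n - m)       # rho^|n-m| with rho = 0.8
partial_sums = lambda n, m: float(min(n, m))     # sums of orthonormal r.v.'s
moving_avg = lambda n, m: {0: 1.0, 1: 0.5}.get(abs(n - m), 0.0)

chain_flags = [is_second_order_chain(c, 6)
               for c in (geometric, partial_sums, moving_avg)]
```

Under these assumptions the first two sequences satisfy the relation (the second is a second order martingale), while the moving-average covariance does not, since $r_{13} = 0$ but $r_{12} r_{23} = 1/4$.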


§37. SECOND ORDER RANDOM FUNCTIONS 

Second order stationary random functions were introduced by Khintchine
(1934) who gave the harmonic decomposition of their covariances.
Slutsky (1937) obtained a first harmonic decomposition of such random
functions. Kolmogorov (1941) proceeded to a detailed study of second
order stationary random sequences by means of Hilbert space methods.
Cramér (1941) extended Khintchine's results to the vector case and
(1942) obtained a decomposition theorem in functional spaces, essentially
equivalent to the harmonic decomposition of second order stationary
random functions; this decomposition is also an immediate consequence
of Stone's theorem (1930) on groups of unitary operators in
Hilbert spaces. All this research is limited to second order stationarity.

The author formulated a calculus of general second order random 
functions and gave (1945-46) the results of this section; they contain 
as special cases the second order stationarity properties (the foregoing 
decomposition was stated there explicitly for the first time). 

37.1. Covariances. We indulge in the usual abuse of notation by
using the same symbol for a function and for its value. The argument
$t$, with or without affixes, will vary over a fixed set $T$. The only
requirement is that the operations performed on the elements of $T$ be
meaningful; this will always be so if $T = R = (-\infty, +\infty)$ and, to make his
life easier, the reader may assume that $T = R$, unless otherwise stated.

A random function $X(t)$ on $T$ is the family of r.v.'s $\{X(t), t \in T\}$;
in general, the r.v.'s will be complex-valued. According to the essential
feature of pr. theory, pr. properties are those which are described
by the consistent set of laws of all finite subfamilies of the r.v.'s $X(t)$.
Conversely, according to the consistency theorem, every such consistent
set of laws is the law of some random function.

A second order random function $X(t)$ on $T$ is a family of second order
r.v.'s: $E|X(t)|^2 < \infty$ whatever be $t \in T$. Without restricting the
generality, we can and do assume that the second order random functions
under consideration are centered at expectations, unless otherwise
stated. Then the second moments $E|X(t)|^2$ are variances and
the function defined on $T \times T$ by

$$\Gamma_X(t, t') = E X(t)\bar{X}(t')$$

is, by definition, the covariance of the random function $X(t)$ on $T$.
According to the Schwarz inequality, this covariance exists and is finite.
Conversely, if $\Gamma_X(t, t')$ on $T \times T$ exists and is finite, then
$E|X(t)|^2 = \Gamma_X(t, t) < \infty$, $t \in T$. Thus, second order random functions can be
defined as those having covariances. Their second order properties are
those which can be defined or determined by means of covariances. It
is to be expected that to a covariance corresponds more than one random
function. For example, the covariances of the random functions
$X(t)$ and $Y(t) = \eta X(t)$ on $T$, where $\eta$ is a r.v. independent of all $X(t)$,
$t \in T$, with $E|\eta|^2 = 1$, coincide. In fact, this example shows that
our convention which consists in centering second order random functions
at their expectations is immaterial. If we take $E\eta = 0$, then the
function $E X(t)\bar{X}(t')$ where the random function $X(t)$ is not centered
at its expectation is still a covariance of the random function $Y(t)$
centered at its expectation:

$$E Y(t) = E\eta X(t) = E\eta \, E X(t) = 0,$$

$$E Y(t)\bar{Y}(t') = E|\eta|^2 X(t)\bar{X}(t') = E|\eta|^2 \, E X(t)\bar{X}(t') = E X(t)\bar{X}(t').$$

Since we consider only first and second (mixed or not) moments, it 
is natural to try to make correspond to a covariance a random function 
whose law is determined by these moments only. This will be done in 
the next theorem. We require the following definition. 

A function $\Gamma(t, t')$ on $T \times T$ is of nonnegative-definite type if, for
every finite subset $T_n \subset T$ and every function $h(t)$ on $T_n$,

$$\sum_{t, t' \in T_n} \Gamma(t, t') h(t)\bar{h}(t') \ge 0.$$

Then it is hermitian, that is, $\Gamma(t, t') = \bar{\Gamma}(t', t)$. For, with $T_1 = \{t\}$
and $h(t) = 1$, we have $\Gamma(t, t) \ge 0$; then, with $T_2 = \{t, t'\}$, the expression
$\Gamma(t, t')h(t)\bar{h}(t') + \Gamma(t', t)h(t')\bar{h}(t)$ is real, and hermiticity follows
by taking $h(t) = 1$ and $h(t') = 1$, then $h(t') = i$. The reason for the terminology is
that a nonnegative-definite type function $\Gamma(t, t') = f(t - t')$ which
depends only upon the difference $u = t - t'$ of its arguments reduces
to a nonnegative-definite function $f(u)$. We require the following
lemma.

a. Let $j$, $k$ vary over $1, \cdots, n$ and let the $u_k$ vary over $(-\infty, +\infty)$. If
$Q(u) = \sum m_{jk} u_j u_k \ge 0$, $m_{jk} \in (-\infty, +\infty)$, then there exist jointly normal
real-valued r.v.'s $X_k$ (some of which may be degenerate at 0) such that
$m_{jk} = E X_j X_k$.

Proof. According to the classical properties of quadratic forms, the
assumption means that

$$Q(u) = R(v) = \sum \sigma_k^2 v_k^2, \quad (\sigma_k^2 \ge 0)$$

where the $v$'s are linear combinations of the $u$'s. But $e^{-R(v)/2} = \prod e^{-\sigma_k^2 v_k^2/2}$
is the joint ch.f. of independent normal r.v.'s $Y_k$ (centered
at expectations) with $\sigma_k^2 = E Y_k^2$. Therefore, by going back to the
$u$'s,

$$R(v) = E\Big(\sum Y_k v_k\Big)^2 = E\Big(\sum X_k u_k\Big)^2 = \sum m_{jk} u_j u_k,$$

where the $X$'s are linear combinations of the $Y$'s hence are jointly
normal, and $E X_j X_k = m_{jk}$. The assertion is proved.

A. Covariance criterion and normality. A function $\Gamma(t, t')$ on
$T \times T$ is a covariance if, and only if, it is of nonnegative-definite type.
And every covariance is also the covariance of a strongly normal random
function which can be selected to be real-valued when the covariance is
real-valued.

Proof. Let $\Gamma(t, t') = E X(t)\bar{X}(t')$, $t, t' \in T$. Then

$$\sum_{t, t' \in T_n} \Gamma(t, t') h(t)\bar{h}(t') = E \sum_{t, t' \in T_n} X(t)\bar{X}(t') h(t)\bar{h}(t') = E\Big|\sum_{t \in T_n} X(t)h(t)\Big|^2 \ge 0$$

and the "only if" assertion is proved.

Conversely, let $\Gamma(t, t')$ on $T \times T$ be of nonnegative-definite type.
The "if" assertion means that there exists a random function $X(t)$ on
$T$ such that $E X(t)\bar{X}(t') = \Gamma(t, t')$ on $T \times T$. By hypothesis,

$$Q(u, v) = \sum_{t, t' \in T_n} \Gamma(t, t') h(t)\bar{h}(t') \ge 0$$

or, setting $h(t) = u_t - iv_t$,

$$Q(u, v) = \sum_{t, t' \in T_n} \{\Re\Gamma(t, t')(u_t u_{t'} + v_t v_{t'}) - \Im\Gamma(t, t')(u_t v_{t'} - u_{t'} v_t)\} \ge 0$$

whatever be $u_t$, $v_t$, $t \in T_n$. Therefore, by a, the function $f(u, v) = e^{-Q(u, v)/4}$
is a normal ch.f. of $2n$ r.v.'s (centered at expectations) $Y(t)$
and $Z(t)$ corresponding to $u_t$ and $v_t$, respectively, with

$$E Y(t)Y(t') = E Z(t)Z(t') = \tfrac{1}{2}\Re\Gamma(t, t'), \quad E Y(t)Z(t') = -\tfrac{1}{2}\Im\Gamma(t, t').$$

It follows by setting $X(t) = Y(t) + iZ(t)$ that $E X(t)X(t') = 0$ and

$$E X(t)\bar{X}(t') = \Gamma(t, t'), \quad t, t' \in T_n.$$

The normal laws of finite subfamilies of r.v.'s $X(t)$ so defined for
every $T_n \subset T$ are consistent, since the law for $T_m \subset T_n$ coincides with
the marginal law on $T_m$ obtained by setting $u_t = v_t = 0$ for
$t \in T_n - T_m$. Thus is defined the law of a normal random function
$X(t)$ on $T$ with covariance $\Gamma(t, t')$ on $T \times T$. If the covariance is
real-valued, we can simply set $E X(t)X(t') = \Gamma(t, t')$ and the ch.f.'s $e^{-Q(u, 0)/2}$
determine the law of a real-valued normal random function $X(t)$. The
proof is concluded.
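
In the real-valued case, the construction of lemma a (jointly normal $X$'s as linear combinations of independent normal $Y$'s) can be sketched on a finite set of points. The example below is an illustration only, not the author's method: it uses a Cholesky factorization in place of the diagonalization of the quadratic form, it assumes a strictly positive definite matrix (so singular covariances are rejected), and the covariance $e^{-|t - t'|}$ and the function names are assumptions.

```python
import math

def cholesky(gamma, tol=1e-12):
    """Lower-triangular L with L L^T = gamma, for a real symmetric matrix
    assumed strictly positive definite; raises ValueError otherwise."""
    n = len(gamma)
    low = [[0.0] * n for _ in range(n)]
    for j in range(n):
        d = gamma[j][j] - sum(low[j][k] ** 2 for k in range(j))
        if d <= tol:
            raise ValueError("matrix is not positive definite")
        low[j][j] = math.sqrt(d)
        for i in range(j + 1, n):
            s = gamma[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = s / low[j][j]
    return low

def covariance_of_linear_map(low):
    """E X_j X_k for X = L Y, with the Y_k independent, centered, of unit variance."""
    n = len(low)
    return [[sum(low[j][k] * low[i][k] for k in range(n)) for i in range(n)]
            for j in range(n)]

def is_positive_definite(matrix):
    """True when the Cholesky factorization succeeds."""
    try:
        cholesky(matrix)
        return True
    except ValueError:
        return False

points = [0.0, 0.5, 1.5]                                   # a finite subset T_n
gamma = [[math.exp(-abs(t - u)) for u in points] for t in points]
low = cholesky(gamma)
rebuilt = covariance_of_linear_map(low)                    # should reproduce gamma
```

With $Y$ a vector of independent centered normal r.v.'s of unit variance, $X = LY$ is then jointly normal with $E X_j X_k$ equal to the prescribed matrix, up to rounding.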

Corollary. The real part of a covariance $\Gamma(t, t')$ is a covariance,
while the imaginary part is not a covariance except when it vanishes.

Proof. The first assertion follows from the above proof by
$\Re\Gamma(t, t') = E\{\sqrt{2}\,Y(t) \cdot \sqrt{2}\,Y(t')\}$. The second assertion follows from the fact
that $\Im\Gamma(t, t) = 0$ so that, if $\Im\Gamma(t, t') = E X(t)\bar{X}(t')$ on $T \times T$, then
$E|X(t)|^2 = 0$ on $T$, hence $X(t) = 0$ a.s. and $\Im\Gamma(t, t') = 0$.

We examine a few operations which preserve covariances (see also
the following subsections), and leave to the reader the specialization of
what follows to stationary covariances, that is, to nonnegative-definite
functions.

B. Closure theorem. The class of covariances is closed under additions,
multiplications, and passages to the limit.

Proof. Let $\Gamma_1(t, t')$ and $\Gamma_2(t, t')$ be two covariances and let $X_1(t)$
and $X_2(t)$ be random functions whose covariances are $\Gamma_1(t, t')$ and
$\Gamma_2(t, t')$, respectively. If we select these random functions to be orthogonal,
that is, $E X_1(t)\bar{X}_2(t') = 0$ for $t, t' \in T$, then

$$E\{X_1(t) + X_2(t)\}\{\bar{X}_1(t') + \bar{X}_2(t')\} = E X_1(t)\bar{X}_1(t') + E X_2(t)\bar{X}_2(t') = \Gamma_1(t, t') + \Gamma_2(t, t').$$

If we select them to be independent, then

$$E\{X_1(t)X_2(t)\}\{\bar{X}_1(t')\bar{X}_2(t')\} = E X_1(t)\bar{X}_1(t') \, E X_2(t)\bar{X}_2(t') = \Gamma_1(t, t')\Gamma_2(t, t').$$

Such selections are possible, for it suffices to take the random functions
$X_1(t)$ and $X_2(t)$ to be normal on pr. spaces $(\Omega_1, \mathcal{A}_1, P_1)$ and $(\Omega_2, \mathcal{A}_2, P_2)$,
respectively, and then form the product pr. space. Thus, the first two
assertions are proved.

Finally, let $\Gamma_s(t, t')$ be covariances for $s \in S$ arbitrary with $s_0$ a limit
point of $S$ (not necessarily in $S$) and let $\Gamma_s(t, t') \to \Gamma(t, t')$ on $T \times T$
as $s \to s_0$.

Since passage to the limit and finite summation on $T_n \times T_n$ can be
interchanged, it follows by A that

$$\sum_{T_n}\sum_{T_n} \Gamma(t, t') h(t)\bar{h}(t') = \lim \sum_{T_n}\sum_{T_n} \Gamma_s(t, t') h(t)\bar{h}(t') \ge 0$$

and by the same theorem $\Gamma(t, t')$ is a covariance.

Applications. 1° If $\Gamma(t, t')$ is a covariance, so is its real part and
hence so is the real part $(\Re\Gamma(t, t'))^2 - (\Im\Gamma(t, t'))^2$ of $\Gamma^2(t, t')$; similarly
for higher powers.

2° Every nonnegative number being a covariance, so is every polynomial
in covariances with positive coefficients, and so is every limit
of such polynomials. For example, $1/(1 - tt')$ is covariance of a random
function $\sum_{n=0}^{\infty} \xi_n t^n$ analytic in $(-1, +1)$, where the $\xi_n$ are orthonormal.
This random function was encountered in studying new classes
of limit laws and is at the origin of the investigations of this section.

3° If $\Gamma_s(t, t')$ is continuous in $s \in R$ and $F(s)$ on $R$ is nondecreasing,
then $\int \Gamma_s(t, t')\, dF(s)$ is a covariance, provided it exists and is finite.

4° Let $\Delta_h$ and $\Delta'_h$ be difference operators of step $h$ operating on $t$
and $t'$ ($\in R$), respectively. If $\Gamma(t, t')$ is the covariance of the random
function $X(t)$, then, by computing the covariance of $\Delta_h^n X(t)$ where
$X(t)$ has for covariance $\Gamma(t, t')$, we find that $\Delta_h^n \Delta_h'^n \Gamma(t, t')$ is a covariance
and the variance $\Delta_h^n \Delta_h'^n \Gamma(t, t) \ge 0$. It follows that if
$\dfrac{\partial^{2n}}{\partial t^n \partial t'^n}\Gamma(t, t')$
exists and is finite, then it is a covariance; we shall see that it is the
covariance of the nth derivative in q.m. of $X(t)$.
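
The closure properties can be probed numerically on a finite set of points. The sketch below is an illustration under stated assumptions, not a proof: it takes the covariance $1/(1 - tt')$ of Application 2° together with $e^{-|t - t'|}$ (assumed here as a known covariance), and evaluates the quadratic form $\sum \Gamma(t, t')h(t)\bar{h}(t')$ of their sum and product for randomly drawn complex $h$; the names `quad_form`, `total` and `product` are hypothetical.

```python
import math
import random

def quad_form(gamma, points, h):
    """Sum over t, t' of gamma(t, t') h(t) conj(h(t')) on a finite set of points."""
    return sum(gamma(t, u) * h[i] * h[j].conjugate()
               for i, t in enumerate(points)
               for j, u in enumerate(points))

power_series = lambda t, u: 1.0 / (1.0 - t * u)     # Application 2, |t|, |u| < 1
exponential = lambda t, u: math.exp(-abs(t - u))    # assumed known covariance
total = lambda t, u: power_series(t, u) + exponential(t, u)
product = lambda t, u: power_series(t, u) * exponential(t, u)

def partial_sum(t, u, terms):
    """Covariance of the truncated series sum_{n < terms} xi_n t^n."""
    return sum((t * u) ** n for n in range(terms))

rng = random.Random(0)                              # fixed seed for repeatability
points = [-0.8, -0.3, 0.1, 0.5, 0.9]
forms = []
for _ in range(200):
    h = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in points]
    forms.extend(quad_form(g, points, h) for g in (total, product))

smallest = min(q.real for q in forms)               # expected nonnegative
largest_imag = max(abs(q.imag) for q in forms)      # expected zero up to rounding
series_error = abs(partial_sum(0.5, 0.9, 200) - power_series(0.5, 0.9))
```

Random sampling of $h$ can only fail to find a counterexample; it does not establish nonnegative-definiteness, which is what the theorem guarantees.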

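37.2. Calculus in q.m.; continuity and differentiation. Let $s$, $s'$ vary
over some set $S$ and let $s_0$, $s'_0$ be limit points of $S$; they do not necessarily
belong to $S$ (for example, $S = (-\infty, +\infty)$ and $s_0 = +\infty$).

a. If $X_s \xrightarrow{\text{q.m.}} X$ as $s \to s_0$ and $X'_{s'} \xrightarrow{\text{q.m.}} X'$ as $s' \to s'_0$, then
$E X_s \bar{X}'_{s'} \to E X \bar{X}'$.

Proof. This follows from

$$E(X_s \bar{X}'_{s'} - X\bar{X}') = E(X_s - X)(\bar{X}'_{s'} - \bar{X}') + E(X_s - X)\bar{X}' + E X(\bar{X}'_{s'} - \bar{X}')$$

since, as $s \to s_0$ and $s' \to s'_0$,

$$|E(X_s - X)(\bar{X}'_{s'} - \bar{X}')|^2 \le E|X_s - X|^2 \, E|X'_{s'} - X'|^2 \to 0,$$

and similarly for the two other r.h.s. terms.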

A. Convergence in q.m. criterion. Second order random functions
$X_s(t)$ on $T$ converge in q.m. as $s \to s_0$ to some random function $X(t)$ on
$T$ (necessarily of second order) if, and only if, the functions $E X_s(t)\bar{X}_{s'}(t)$
converge to a finite function on $T$, as $s, s' \to s_0$ in whatever way $s$ and $s'$
converge to $s_0$. Then $\Gamma_{X_s}(t, t') \to \Gamma_X(t, t')$ on $T \times T$.

Proof. The "if" assertion follows, by the mutual convergence in
q.m. criterion, from

$$E|X_s(t) - X_{s'}(t)|^2 = E|X_s(t)|^2 - E X_s(t)\bar{X}_{s'}(t) - E X_{s'}(t)\bar{X}_s(t) + E|X_{s'}(t)|^2 \to \Gamma(t, t) - 2\Gamma(t, t) + \Gamma(t, t) = 0, \quad s, s' \to s_0.$$

The "only if" and the last assertions follow from the foregoing lemma
upon replacing $X_s$ by $X_s(t)$ and $X'_{s'}$ by $X_{s'}(t')$.

A second order random function $X(t)$ on $T$ is continuous in q.m. at
$t \in T$ if

$$X(t + h) \xrightarrow{\text{q.m.}} X(t) \ \text{as} \ h \to 0, \quad t + h \in T.$$

B. Continuity in q.m. criterion. $X(t)$ is continuous in q.m. at
$t \in T$ if, and only if, $\Gamma_X(t, t')$ is continuous at $(t, t)$.

Proof. This follows by the convergence in q.m. criterion from

$$\lim_{h, h' \to 0} E X(t + h)\bar{X}(t + h') = \lim_{h, h' \to 0} \Gamma_X(t + h, t + h'), \quad t + h, \ t + h' \in T.$$

Corollary. If a covariance $\Gamma(t, t')$ on $T \times T$ is continuous at every
diagonal point $(t, t) \in T \times T$, then it is continuous on $T \times T$.

It suffices to observe that if $\Gamma(t, t')$ is the covariance of $X(t)$, then

$$X(t + h) \xrightarrow{\text{q.m.}} X(t), \quad X(t' + h') \xrightarrow{\text{q.m.}} X(t'), \quad h, h' \to 0,$$

imply by a that

$$E X(t + h)\bar{X}(t' + h') \to E X(t)\bar{X}(t').$$


A second order random function $X(t)$ on $T$ has a derivative in q.m.
$X'(t)$ (or $\dfrac{dX(t)}{dt}$) at $t \in T$ if

$$\frac{X(t + h) - X(t)}{h} \xrightarrow{\text{q.m.}} X'(t), \quad h \to 0, \ t + h \in T.$$

C. Differentiation in q.m. criterion. $X(t)$ has a derivative in q.m.
at $t \in T$ if, and only if, the second generalized derivative of $\Gamma_X(t, t')$ exists
and is finite at $(t, t)$.

This follows by the convergence in q.m. criterion from

$$\lim_{h, h' \to 0} E\left\{\frac{X(t + h) - X(t)}{h} \cdot \frac{\bar{X}(t + h') - \bar{X}(t)}{h'}\right\} = \lim_{h, h' \to 0} \frac{1}{hh'}\Delta_h \Delta'_{h'}\Gamma_X(t, t).$$

Corollary 1. If the second generalized derivative of a covariance
$\Gamma(t, t')$ on $T \times T$ exists and is finite at every diagonal point $(t, t) \in T \times T$,
then the derivatives $\dfrac{\partial}{\partial t}\Gamma(t, t')$, $\dfrac{\partial}{\partial t'}\Gamma(t, t')$, $\dfrac{\partial^2}{\partial t\,\partial t'}\Gamma(t, t')$ exist and are finite
on $T \times T$.

It suffices to observe that, if $\Gamma(t, t')$ is the covariance of a random function
$X(t)$, then $X'(t)$ exists, and since "$E$" and "lim q.m." can be interchanged,
it follows by a that

$$E X'(t)\bar{X}(t') = \lim_{h \to 0} \frac{\Gamma(t + h, t') - \Gamma(t, t')}{h} = \frac{\partial}{\partial t}\Gamma(t, t').$$

Similarly for $\dfrac{\partial}{\partial t'}\Gamma(t, t')$, and also for

$$E X'(t)\bar{X}'(t') = \lim_{h' \to 0} \frac{1}{h'}\left\{\frac{\partial}{\partial t}\Gamma(t, t' + h') - \frac{\partial}{\partial t}\Gamma(t, t')\right\} = \frac{\partial^2}{\partial t\,\partial t'}\Gamma(t, t').$$

Corollary 2. If $X'(t)$ on $T$ exists, then $\Gamma_{X'}(t, t') = \dfrac{\partial^2}{\partial t\,\partial t'}\Gamma_X(t, t')$ on
$T \times T$.

This property extends at once to derivatives in q.m. of higher order.

Assume that $\Gamma(t, t')$ is indefinitely differentiable on $T \times T$, select for
origin a fixed value of the argument, set

$$X_n(t) = X(0) + \frac{t}{1!}X'(0) + \cdots + \frac{t^n}{n!}X^{(n)}(0),$$

and form $E|X(t) - X_n(t)|^2$. Elementary computations yield

Corollary 3. A second order random function $X(t)$ on $T$ is analytic in
q.m. if, and only if, $\Gamma_X(t, t')$ is analytic at every diagonal point $(t, t) \in T \times T$;
and then $\Gamma_X(t, t')$ is analytic on $T \times T$.
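
The differentiation criterion can be illustrated by evaluating the quotient $\frac{1}{hh'}\Delta_h\Delta'_{h'}\Gamma(t, t)$ for small steps. The sketch below is an assumption-laden example rather than part of the text: it compares the covariance $e^{-(t - t')^2/2}$, for which the quotient settles near 1, with $e^{-|t - t'|}$, for which it grows like $2/h$, so that no derivative in q.m. exists. Both covariances and the function name are chosen here for illustration.

```python
import math

def second_difference_quotient(gamma, t, h, k):
    """(1/(h k)) * Delta_h Delta'_k gamma, evaluated at the diagonal point (t, t)."""
    return (gamma(t + h, t + k) - gamma(t + h, t)
            - gamma(t, t + k) + gamma(t, t)) / (h * k)

smooth = lambda t, u: math.exp(-0.5 * (t - u) ** 2)   # differentiable in q.m.
rough = lambda t, u: math.exp(-abs(t - u))            # continuous in q.m. only

steps = [10.0 ** -p for p in range(1, 5)]             # h = 0.1, 0.01, 0.001, 0.0001
smooth_quotients = [second_difference_quotient(smooth, 0.3, h, 2 * h) for h in steps]
rough_quotients = [second_difference_quotient(rough, 0.3, h, h) for h in steps]
```

The steps $h$ and $h' = 2h$ are taken unequal in the smooth case only to exercise the definition; the criterion requires the same limit in whatever way $h, h' \to 0$.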


37.3. Calculus in q.m.; integration. The investigation of integrals in
q.m. follows the foregoing pattern but is somewhat more involved.

Let $X(t)$ and $Y(t)$ on $T$ be second order random functions with
covariances $\Gamma_X(t, t')$ and $\Gamma_Y(t, t')$. Contrary to our convention, we do
not assume that they are centered at expectations. The reason is that
we shall have cases in which either one or the other of these random
functions degenerates into a nonvanishing sure function, while if they
were centered at expectations the sure function would have to vanish.

The Riemann-Stieltjes integrals in q.m. are defined as follows: Let

$$D_l: \ a = t_1 < t_2 < \cdots < t_{n+1} = b$$

be a finite set of consecutive points defining a partition of the finite
interval $I = [a, b)$ and set $|D_l| = \max_{k \le n}(t_{k+1} - t_k)$. If
$X_{D_l}(t) = \sum_{k=1}^{n} X_k I_{[t_k, t_{k+1})}(t)$ is a random step-function, we set

$$\int_I X_{D_l}(t)\, dY(t) = \sum_{k=1}^{n} X_k\{Y(t_{k+1}) - Y(t_k)\}.$$

In the general case of a second order random function $X(t)$, we set
$X_k = X(t'_k)$, $t_k \le t'_k < t_{k+1}$,

$$\int_a^b X(t)\, dY(t) = \underset{|D_l| \to 0}{\text{lim q.m.}} \int_I X_{D_l}(t)\, dY(t),$$

$$\int_{-\infty}^{+\infty} X(t)\, dY(t) = \underset{a \to -\infty,\ b \to +\infty}{\text{lim q.m.}} \int_a^b X(t)\, dY(t),$$

(and similarly if only $a \to -\infty$ or $b \to +\infty$), provided the second limit
exists and the first limit exists for the sequence of partitions $D_l$ and
is independent of the choices of the corresponding $t'_k$; they are necessarily
defined up to an equivalence.

It is important to bear in mind that the preceding definition depends
not upon the random function $Y(t)$ but upon its increments $\Delta Y(t)$;
in other words, the random functions $Y(t)$ are to be considered as
defined up to an additive r.v. Similarly, it is not the covariance
$\Gamma_Y(t, t')$ of $Y(t)$ which matters below but its increments
$\Delta\Delta'\Gamma_Y(t, t') = E\,\Delta Y(t)\,\overline{\Delta' Y(t')}$.

A. Integration in q.m. criterion. Let the second order random functions
$X(t)$, with or without affixes, be independent of the second order increment
function $\Delta Y(t)$ on an interval $I$, finite or not. Then
$\int_I X(t)\, dY(t)$ exists if, and only if, $\int_I\int_I \Gamma_X(t, t')\, dd'\Gamma_Y(t, t')$ exists
and, if the integrals in q.m. which figure below exist, then

$$E\left\{\int_I X_s(t)\, dY(t)\ \overline{\int_I X_{s'}(t')\, dY(t')}\right\} = \int_I\int_I E(X_s(t)\bar{X}_{s'}(t'))\, dd'\Gamma_Y(t, t').$$

The double integrals are usual Riemann-Stieltjes integrals.

Proof. The first assertion follows, upon starting with finite intervals,
from the convergence in q.m. criterion applied to

$$E\left\{\int_I X_{D_l}(t)\, dY(t)\ \overline{\int_I X_{D_{l'}}(t')\, dY(t')}\right\};$$

similarly for the second assertion upon replacing $X_{D_l}$ by $X_{D_l, s}$ and $X_{D_{l'}}$
by $X_{D_{l'}, s'}$.

Remark. The independence condition is certainly fulfilled when the
random functions $X(t)$ and $Y(t)$ are independent or when $X(t)$ or $\Delta Y(t)$
degenerate into sure functions. On the other hand, it is used only to
assert that for $t, t' \in I$

$$E X(t)\bar{X}(t')\,\Delta Y(t)\,\overline{\Delta' Y(t')} = E X(t)\bar{X}(t') \cdot E\,\Delta Y(t)\,\overline{\Delta' Y(t')}$$

and, hence, it can be replaced by this less restrictive condition. Finally,
it can be suppressed altogether, provided the elements of double integrals
are replaced by, say, $dd'E\{X(t)\bar{X}(t')Y(t)\bar{Y}(t')\}$.
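
As a concrete instance of the criterion, the double Riemann-Stieltjes integral can be approximated by a finite sum over a partition. The sketch below is an illustration only: it assumes that $X(t) = t$ is a sure function on $[0, 1)$ and that $Y(t)$ has orthogonal increments with covariance $\Gamma_Y(t, t') = \min(t, t')$, so that $E\left|\int X\, dY\right|^2 = \int_0^1 t^2\, dt = 1/3$; the function names are hypothetical.

```python
def double_increment(gamma, t0, t1, u0, u1):
    """Delta Delta' gamma over the rectangle [t0, t1) x [u0, u1)."""
    return gamma(t1, u1) - gamma(t1, u0) - gamma(t0, u1) + gamma(t0, u0)

def double_stieltjes_sum(moment, gamma, a, b, n):
    """Riemann-Stieltjes sum of moment(t, t') dd'gamma(t, t') on a uniform
    partition of [a, b) into n cells, tagged at the left endpoints."""
    pts = [a + (b - a) * k / n for k in range(n + 1)]
    return sum(moment(pts[k], pts[j])
               * double_increment(gamma, pts[k], pts[k + 1], pts[j], pts[j + 1])
               for k in range(n) for j in range(n))

gamma_y = lambda t, u: min(t, u)     # assumed: Y has orthogonal increments
moment_x = lambda t, u: t * u        # E X(t) X(t') for the sure function X(t) = t
approx = double_stieltjes_sum(moment_x, gamma_y, 0.0, 1.0, 400)
```

With orthogonal increments the off-diagonal cells contribute nothing, and the sum reduces to $\sum_k t_k^2 (t_{k+1} - t_k)$, which approaches $1/3$ as the partition is refined.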

Corollary 1. Formal properties of Riemann-Stieltjes integrals such
as finite additivity hold a.s. for corresponding integrals in q.m.

The corollary follows by elementary computations.

Let $D: a = t_1 < t_2 < \cdots < t_{n+1} = b$ and $D': a = t'_1 < t'_2 < \cdots < t'_{n'+1} = b$
be finite sets of points defining partitions of the finite interval $I = [a, b)$.
Let $\Delta Y(t) = Y(t_{k+1}) - Y(t_k)$ and $\Delta' Y(t') = Y(t'_{k'+1}) - Y(t'_{k'})$ denote
the corresponding increments of $Y(t)$ for $t = t_k \in D$ and $t' = t'_{k'} \in D'$.
We say that the function $\Gamma_Y(t, t')$ is of bounded variation on $I \times I$ if
there exists a constant $c_I$ such that

$$\sum_{t \in D}\sum_{t' \in D'} |E\,\Delta Y(t)\,\overline{\Delta' Y(t')}| = \sum_{t \in D}\sum_{t' \in D'} |\Delta\Delta'\Gamma_Y(t, t')| \le c_I < \infty$$

whatever be $D$ and $D'$. We say that $\Gamma_Y(t, t')$ is of bounded variation
on the infinite interval $I \times I$ if there exists a constant $c$ such that
$c_{I'} \le c < \infty$ whatever be the finite interval $I' \subset I$ and then we write, for short,
$\int\int |dd'\Gamma_Y(t, t')| < \infty$. Clearly

Corollary 2. If on $I \times I$ the random function $X(t)$ is continuous in
q.m. and independent of the increment random function $\Delta Y(t)$ and if the
covariance $\Gamma_X(t, t')$ is bounded while the covariance $\Gamma_Y(t, t')$ is of bounded
variation, then $\int_I X(t)\, dY(t)$ exists.

Let us observe that, under the bounded variation condition, the
covariance $\Gamma_Y(t, t')$ can be assumed to have the property
$\lim_{t, t' \to -\infty} \Gamma_Y(t, t') = 0$; it can also be normalized, that is, replaced by

$$\hat{\Gamma}_Y(t, t') = \tfrac{1}{4}\{\Gamma_Y(t + 0, t' + 0) + \Gamma_Y(t - 0, t' + 0) + \Gamma_Y(t + 0, t' - 0) + \Gamma_Y(t - 0, t' - 0)\}$$

where the limits exist and are finite. Furthermore, because of the
boundedness and continuity condition on $\Gamma_X(t, t')$, the integral
$\int\int \Gamma_X(t, t')\, dd'\Gamma_Y(t, t')$ remains the same if $\Gamma_Y$ is replaced by $\hat{\Gamma}_Y$.

37.4. Fourier-Stieltjes transforms in q.m. A covariance $\Gamma(t, t')$ is
said to be harmonizable if there exists a covariance $\gamma(s, s')$ of bounded
variation on $R \times R$ such that

$$(H_\Gamma) \qquad \Gamma(t, t') = \int\int e^{i(ts - t's')}\, dd'\gamma(s, s').$$

A second order random function $X(t)$ is said to be harmonizable if there
exists a second order random function $\xi(s)$ with a covariance $\gamma(s, s')$
of bounded variation on $R \times R$ such that

$$(H_X) \qquad X(t) = \int e^{its}\, d\xi(s) \ \text{a.s.}$$

We intend to show that harmonizability of a random function implies
that of its covariance, and conversely. The direct assertion follows at
once from the integration in q.m. criterion, and the problem lies in the
proof of the converse assertion. We shall use repeatedly the convergence
and integration in q.m. criteria, without further comment.

To begin with, let us observe that the bounded variation condition
on $\gamma(s, s')$ implies that harmonizable random functions are continuous
in q.m. and harmonizable covariances are continuous and bounded.
Moreover, $\gamma(s \pm 0, s' \pm 0)$ and $\xi(s \pm 0)$, $\xi(\mp\infty)$ exist.

We denote by $\hat{\xi}(s)$ and $\Delta_0\xi(s)$ the "normalized" $\xi(s)$ and the "jump"
of $\xi(s)$ at $s$, defined by

$$\hat{\xi}(s) = \tfrac{1}{2}\{\xi(s + 0) + \xi(s - 0)\}, \quad \Delta_0\xi(s) = \xi(s + 0) - \xi(s - 0).$$

Similarly, we denote by $\hat{\gamma}(s, s')$ and $\Delta_0\Delta'_0\gamma(s, s')$ the "normalized"
$\gamma(s, s')$ and the "jump" of $\gamma(s, s')$ at $(s, s')$, defined by

$$\hat{\gamma}(s, s') = \tfrac{1}{4}\{\gamma(s + 0, s' + 0) + \gamma(s + 0, s' - 0) + \gamma(s - 0, s' + 0) + \gamma(s - 0, s' - 0)\}$$

$$\Delta_0\Delta'_0\gamma(s, s') = \gamma(s + 0, s' + 0) - \gamma(s + 0, s' - 0) - \gamma(s - 0, s' + 0) + \gamma(s - 0, s' - 0).$$

It follows that

$$E\,\hat{\xi}(s)\,\overline{\hat{\xi}(s')} = \hat{\gamma}(s, s'), \quad E\,\Delta_0\xi(s)\,\overline{\Delta'_0\xi(s')} = \Delta_0\Delta'_0\gamma(s, s').$$

Let $\Delta_h$ and $\Delta'_{h'}$ be difference operators of step $h$ and $h'$ acting on $s$ and
$s'$, respectively. Let $a_\tau(u) = \sin \tau u/\tau u$ and let

$$b_\tau(v, h) = \frac{1}{\pi}\int_{\tau(v - h)}^{\tau v} \frac{\sin u}{u}\, du.$$

We use repeatedly the fact that, as $\tau \to \infty$, $b_\tau(v, h) \to 0$, $1$, or $\tfrac{1}{2}$ according as
$v(v - h) > 0$, $< 0$, or $= 0$.
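
The three limiting values of $b_\tau(v, h)$ can be checked numerically. The sketch below is only an illustration: it approximates the integral by composite Simpson's rule (a choice made here, not in the text) at the assumed value $\tau = 400$, and the function names and the `steps_per_unit` parameter are hypothetical.

```python
import math

def sinc(u):
    """sin(u)/u, extended by continuity at u = 0."""
    return 1.0 if u == 0.0 else math.sin(u) / u

def b(tau, v, h, steps_per_unit=50):
    """b_tau(v, h) = (1/pi) * integral of sin(u)/u from tau*(v - h) to tau*v,
    approximated by composite Simpson's rule."""
    lo, hi = tau * (v - h), tau * v
    n = 2 * max(1, int(abs(hi - lo) * steps_per_unit / 2))   # even number of cells
    step = (hi - lo) / n
    total = sinc(lo) + sinc(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * sinc(lo + k * step)
    return total * step / (3 * math.pi)

tau = 400.0
outside = b(tau, 1.0, 0.5)      # v (v - h) > 0: limit 0
inside = b(tau, 0.25, 0.5)      # v (v - h) < 0: limit 1
boundary = b(tau, 0.5, 0.5)     # v (v - h) = 0: limit 1/2
```

At finite $\tau$ the values differ from the limits by terms of order $1/\tau$, which is why only approximate agreement should be expected.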

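a. Inversion of covariances. If a covariance $\Gamma(t, t')$ is harmonizable,
then, as $\tau, \tau' \to \infty$,

$$\frac{1}{4\tau\tau'}\int_{-\tau}^{+\tau}\int_{-\tau'}^{+\tau'} e^{-i(ts - t's')}\,\Gamma(t, t')\, dt\, dt' \to \Delta_0\Delta'_0\gamma(s, s')$$

and

$$\frac{1}{4\pi^2}\int_{-\tau}^{+\tau}\int_{-\tau'}^{+\tau'} \frac{\Delta_h e^{-its}}{-it}\cdot\frac{\Delta'_{h'} e^{it's'}}{it'}\,\Gamma(t, t')\, dt\, dt' \to \Delta_h\Delta'_{h'}\hat{\gamma}(s, s').$$

Proof. $(H_\Gamma)$ entails, by elementary transformations, that the integrals
are, respectively,

$$\int\int a_\tau(u - s)\,a_{\tau'}(u' - s')\, dd'\gamma(u, u'),$$

$$\int\int b_\tau(u - s, h)\,b_{\tau'}(u' - s', h')\, dd'\gamma(u, u'),$$

and the assertions follow by letting $\tau, \tau' \to \infty$.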
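
For a purely discrete spectrum the first integral of the proof can be written in closed form, which makes the inversion easy to observe. The sketch below rests on assumptions not made in the text: $\gamma$ is taken to have orthogonal jumps of sizes $\sigma_k^2$ at finitely many hypothetical frequencies $s_k$, so that $dd'\gamma$ is carried by the diagonal points $(s_k, s_k)$ and the integral reduces to $\sum_k \sigma_k^2\, a_\tau(s_k - s)\, a_{\tau'}(s_k - s')$; here $\tau = \tau'$.

```python
import math

def a(tau, u):
    """a_tau(u) = sin(tau u) / (tau u), with a_tau(0) = 1."""
    return 1.0 if u == 0.0 else math.sin(tau * u) / (tau * u)

# Hypothetical discrete spectrum: frequency s_k -> jump sigma_k^2 of gamma at (s_k, s_k).
spectrum = {-1.0: 0.5, 0.0: 2.0, 2.5: 1.0}

def averaged_transform(s, s_prime, tau):
    """sum_k sigma_k^2 a_tau(s_k - s) a_tau(s_k - s'), the first integral of the
    proof of lemma a for the discrete spectrum above, with tau = tau'."""
    return sum(var * a(tau, sk - s) * a(tau, sk - s_prime)
               for sk, var in spectrum.items())

tau = 1.0e6
at_jump = averaged_transform(2.5, 2.5, tau)      # tends to the jump 1.0
off_diagonal = averaged_transform(2.5, 0.0, tau) # tends to 0
no_jump = averaged_transform(1.0, 1.0, tau)      # tends to 0
```

Under these assumptions the averaged transform recovers the jump of $\gamma$ at $(s, s')$: it is close to $\sigma_k^2$ when $s = s' = s_k$ and close to 0 elsewhere.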

b. Inversions of random functions. If $X(t)$ is a random function
with a harmonizable covariance $\Gamma(t, t')$, then there exist random functions
$\Delta_0\xi(s)$ with covariance $\Delta_0\Delta'_0\gamma(s, s')$ and $\hat{\xi}(s)$ with covariance $\hat{\gamma}(s, s')$
such that, as $\tau \to \infty$,

$$\frac{1}{2\tau}\int_{-\tau}^{+\tau} e^{-its}X(t)\, dt \xrightarrow{\text{q.m.}} \Delta_0\xi(s)$$

$$\frac{1}{2\pi}\int_{-\tau}^{+\tau} \frac{\Delta_h e^{-its}}{-it}X(t)\, dt \xrightarrow{\text{q.m.}} \Delta_h\hat{\xi}(s).$$

If the random function $X(t)$ is harmonizable with respect to $\xi(s)$, then
$\Delta_0\xi(s)$ is the jump function of $\xi(s)$ and $\hat{\xi}(s)$ is the normalized $\xi(s)$.

Proof. The first assertions follow readily from the foregoing lemma.
The remaining ones can be deduced from the first ones or follow directly
from $(H_X)$, upon observing that, by elementary transformations,
the foregoing integrals become, respectively,

$$\int a_\tau(u - s)\, d\xi(u), \quad \int b_\tau(u - s, h)\, d\xi(u).$$

A. Harmonizability theorem. A random function is harmonizable
if, and only if, its covariance is harmonizable.

Proof. From

$$X(t) = \int e^{its}\, d\xi(s), \quad \int\int |dd'\gamma(s, s')| < \infty$$

where $\gamma(s, s')$ is the covariance of $\xi(s)$, it follows at once that

$$E X(t)\bar{X}(t') = \int\int e^{i(ts - t's')}\, dd'\gamma(s, s').$$

Conversely, let $X(t)$ have for covariance

$$\Gamma(t, t') = \int\int e^{i(ts - t's')}\, dd'\gamma(s, s') \quad \text{with} \quad \int\int |dd'\gamma(s, s')| < \infty.$$

Since the integrand is continuous and bounded and $\gamma(s, s')$ is of bounded
variation, we can assume without restricting the generality that $\gamma(s, s')$
is normalized. According to the foregoing lemma, there exists a random
function $\xi(s)$ whose covariance is $\gamma(s, s')$ such that

$$\frac{1}{2\pi}\int_{-\tau}^{+\tau} \frac{\Delta_h e^{-ist}}{-it}X(t)\, dt \xrightarrow{\text{q.m.}} \Delta_h\xi(s), \quad \tau \to \infty.$$

Upon applying the second parts of the foregoing lemmas, it follows by
elementary computations that $Y(t) = \int e^{its}\, d\xi(s)$ exists and
$E|X(t)|^2 = E X(t)\bar{Y}(t) = E|Y(t)|^2$, so that $E|X(t) - Y(t)|^2 = 0$, and the proof
is concluded.

Particular cases 

1° If $s$ varies over a countable set only, then the integrals with
respect to $\xi(s)$ and $\gamma(s, s')$ reduce to countable sums, and what precedes
continues to hold. In fact, the proofs reduce to those of the first parts
of a and b.

2° If $t$ varies over a countable set only, then the integrals with
respect to $dt$, $dt\, dt'$ reduce to countable sums and what precedes
continues to hold. In fact, the same proofs apply, provided $\Gamma(t, t')$ and
$X(t)$ are first extended by letting $t, t'$ vary over $R$ in $(H_\Gamma)$ and $(H_X)$.

Remark. The following analytical problem is of interest: characterize
harmonizable covariances $\Gamma(t, t')$, that is, harmonizable functions
of nonnegative-definite type. The answer ought to reduce to Bochner’s
theorem in the particular case of a continuous covariance $\Gamma(t, t') = f(t - t')$.
A necessary condition is that $\Gamma(t, t')$ be continuous and
bounded. Is this condition sufficient? If not, what supplementary
conditions (which ought to disappear in the preceding particular case)
are required?

37.5. Orthogonal decompositions. Among various decompositions of 
second order random functions, the orthogonal ones play a prominent 
role. The physical reason is that orthogonal components can be isolated
experimentally by means of suitable “filters.” The mathematical
reason is that orthogonal decompositions correspond to the introduction
of a general form of cartesian frames of reference which allow the
use of a general form of Pythagorean relation. We saw in the preceding
subsection that in the case of a random function defined on a countable 
set of values of the argument such a decomposition is always possible 
and the frame of reference can be obtained by linear combinations of 
the random values of the function. We intend to proceed to more 
general orthogonal decompositions of the same character. First let us 
give two countable decompositions. The one below follows from 34.2B, 
Corollary 3 by elementary computations. 

A. Orthogonal expansion theorem. The expansion

$$X(t) = X(0) + \frac{t}{1!} X'(0) + \frac{t^2}{2!} X''(0) + \cdots, \quad t \in I,$$

of a second order analytic random function $X(t)$ is an orthogonal decomposition
if, and only if, its (analytic) covariance $\Gamma(t, t')$ is a function of the
product $tt'$ of its arguments in $I \times I$.

In fact, every random function $X(t)$ continuous in q.m. on a closed
interval $I$ has a countable orthogonal decomposition. We shall use
Mercer’s theorem which states that if a nonnegative-definite type function
$\Gamma(t, t')$ is continuous on $I \times I$, then

$$\Gamma(t, t') = \sum_n |\lambda_n|^2\, \psi_n(t)\bar{\psi}_n(t'),$$

where the series converges absolutely and uniformly on $I \times I$, and the
continuous functions $\psi_n(t)$ are “proper functions” of $\Gamma(t, t')$ corresponding
to “proper values” $|\lambda_n|^2$:

$$\int_I \Gamma(t, t')\, \psi_n(t')\, dt' = |\lambda_n|^2\, \psi_n(t).$$

Proper functions which correspond to (necessarily finitely) multiple
proper values are written with distinct indices, and all proper functions
are orthonormalized on $I$:

$$\int_I \psi_m(t)\bar{\psi}_n(t)\, dt = \delta_{mn}.$$

B. Proper orthogonal decomposition theorem. A random function
$X(t)$ continuous in q.m. on a closed interval $I$ has on $I$ an orthogonal
decomposition

$$X(t) = \sum_n \lambda_n \xi_n \psi_n(t)$$

with

$$E\xi_m\bar{\xi}_n = \delta_{mn}, \qquad \int_I \psi_m(t)\bar{\psi}_n(t)\, dt = \delta_{mn},$$

if, and only if, the $|\lambda_n|^2$ are the proper values and the $\psi_n(t)$ are the
orthonormalized proper functions of its covariance. Then the series converges
in q.m. uniformly on $I$.

Proof. Let $t, t'$ vary over $I$, and

$$X_n(t) = \sum_{k=1}^{n} \lambda_k \xi_k \psi_k(t).$$

If $X(t)$ has the asserted decomposition, then

$$\Gamma(t, t') = \lim EX_n(t)\bar{X}_n(t') = \sum_n |\lambda_n|^2\, \psi_n(t)\bar{\psi}_n(t'),$$

and the “only if” assertion follows. Conversely, let the $\psi_n(t)$ be the
orthonormalized proper functions of the covariance $\Gamma(t, t')$ of $X(t)$,
and form the integrals

$$\lambda_n \xi_n = \int_I X(t)\bar{\psi}_n(t)\, dt.$$

These integrals exist, since $X(t)$ (in q.m.) and $\psi_n(t)$ are continuous on
the closed interval $I$, and

$$E\xi_m\bar{\xi}_n = \delta_{mn}, \qquad EX(t)\bar{\xi}_n = \lambda_n \psi_n(t).$$

It follows from Mercer’s theorem that $E|X(t) - X_n(t)|^2 \to 0$ uniformly
on $I$, and the “if” assertion is proved.
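
As a numerical illustration of the proper orthogonal decomposition (an addition, not part of the original argument), the sketch below discretizes the covariance $\Gamma(t, t') = \min(t, t')$ on $I = [0, 1]$. This covariance is an assumed example, chosen because its proper values are known in closed form, $|\lambda_n|^2 = 1/((n - \tfrac12)^2\pi^2)$. The helper name `proper_values`, the midpoint grid, and the truncation order are illustrative choices only.

```python
import numpy as np

def proper_values(cov, n_grid=400):
    """Approximate proper values and proper functions of a covariance on [0, 1].

    The integral equation  int_I Gamma(t, t') psi(t') dt' = |lambda|^2 psi(t)
    is replaced by a midpoint-rule matrix eigenproblem (an assumed discretization).
    """
    h = 1.0 / n_grid
    t = (np.arange(n_grid) + 0.5) * h          # midpoints of the grid cells
    gamma = cov(t[:, None], t[None, :])        # matrix Gamma(t_i, t_j)
    vals, vecs = np.linalg.eigh(gamma * h)     # symmetric eigenproblem
    order = np.argsort(vals)[::-1]             # largest proper value first
    psi = vecs[:, order] / np.sqrt(h)          # sum(psi_m * psi_n) * h = delta_mn
    return t, vals[order], psi

t, lam2, psi = proper_values(np.minimum)

# Closed-form proper values of min(t, t') on [0, 1].
exact = 1.0 / (((np.arange(1, 6) - 0.5) * np.pi) ** 2)

# Mercer: a truncated sum of |lambda_n|^2 psi_n(t) psi_n(t') approaches Gamma(t, t').
k = 50
approx = (psi[:, :k] * lam2[:k]) @ psi[:, :k].T
mercer_error = np.max(np.abs(approx - np.minimum(t[:, None], t[None, :])))
```

The leading entries of `lam2` can be compared with `exact`, and `mercer_error` shrinks as `k` grows, in line with the uniform convergence asserted by Mercer’s theorem.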

In physics, the most important orthogonal decomposition is the 
harmonic one, for, loosely speaking, it yields “amplitudes,” and hence 
“energies,” corresponding to the various parts of the “spectrum” of 
the random function, and we seek it now. But, first, we have to introduce
random functions which correspond to sums of orthogonal r.v.’s.
It will be convenient to denote the increment of a function, say $\xi(t)$,
on an interval $[a, b)$ by $\xi[a, b) = \xi(b) - \xi(a)$. Increment functions are
characterized by their additivity:

$$\xi[a, b) + \xi[b, c) = \xi[a, c)$$

and determine the point functions $\xi(t)$ up to additive quantities. While
what follows is valid for more general ordered sets $T$, we shall assume,
to simplify the language, that $T \subset R$.

A second order random function $\xi(t)$ has orthogonal increments if,
for disjoint intervals $[a, b)$, $[a', b')$,

$$E\xi[a, b)\bar{\xi}[a', b') = 0.$$

Then

$$E|\xi[a, b) \pm \xi[a', b')|^2 = E|\xi[a, b)|^2 + E|\xi[a', b')|^2,$$

and it follows, by setting, for some fixed $a$,

$$E|\xi[a, t)|^2 = F(t), \qquad E|\xi[t, a)|^2 = -F(t),$$

that

$$E|\xi[t, t')|^2 = F[t, t');$$

for short,

$$E|d\xi(t)|^2 = dF(t).$$

If $\gamma(t, t')$ is the covariance of $\xi(t)$, this relation becomes

$$\gamma(t', t') - \gamma(t', t) - \gamma(t, t') + \gamma(t, t) = F(t') - F(t);$$

for short,

$$dd'\gamma(t, t') = dF(t).$$

In words, the increment of the foregoing covariance over a two-dimensional
rectangle with sides parallel to the axes of $t$ and of $t'$ reduces to
its increment over the square whose diagonal is the part of the line
$t = t'$ belonging to the rectangle (draw the figure).
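
The geometric statement about rectangles can be checked on a concrete case. The sketch below is an illustration under assumptions: it takes a random function orthogonal to its increments, so that its covariance is $F(\min(t, t'))$, with a hypothetical nondecreasing $F$ having one jump. It compares the increment of the covariance over a rectangle with the increment of $F$ over the part of the diagonal inside it.

```python
import math

def F(t):
    """An assumed nondecreasing function with a unit jump just after t = 1."""
    return t + (1.0 if t > 1.0 else 0.0)

def gamma(t, u):
    """Assumed covariance F(min(t, u)) of a random function orthogonal to its increments."""
    return F(min(t, u))

def rectangle_increment(a, b, a2, b2):
    """Increment of gamma over the rectangle [a, b) x [a2, b2)."""
    return gamma(b, b2) - gamma(a, b2) - gamma(b, a2) + gamma(a, a2)

def diagonal_increment(a, b, a2, b2):
    """Increment of F over the intersection of [a, b) and [a2, b2)."""
    lo, hi = max(a, a2), min(b, b2)
    return F(hi) - F(lo) if hi > lo else 0.0

# Overlapping, disjoint, and nested pairs of intervals.
cases = [(0.0, 2.0, 1.0, 3.0), (0.0, 1.0, 2.0, 3.0), (0.0, 3.0, 1.0, 2.0)]
results = [(rectangle_increment(*c), diagonal_increment(*c)) for c in cases]
```

For disjoint intervals both quantities vanish, which is the orthogonality of the increments; for overlapping intervals both equal the increment of $F$ over the common part.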


Observe that the second moments of the increments of $\xi(t)$ are
bounded (by some fixed finite number) if, and only if, the finite nondecreasing
function $F(t)$ is bounded on $T$. Whether bounded or not,
this nondecreasing function extends to a nondecreasing function (finite
or not) on $R = (-\infty, +\infty)$ and, in fact, on $\bar{R} = [-\infty, +\infty]$. This
leads to


C. Extension theorem. If the random function $\xi(t)$ on $T$ has orthogonal
increments with bounded second moments, then, preserving this property,
$\xi(t)$ can be extended by continuity to the closure of $T$ and can be
extended on $\bar{R}$.

Proof. Let $\tau$ be a limit point of $T$ from the left (it may or may not
belong to $T$ and may be $+\infty$). Since $F(t)$ is nondecreasing and bounded
on $T$, $F(\tau - 0) = \lim_{t \uparrow \tau} F(t)$ exists and is finite, and as $t, t' \uparrow \tau$ $(t > t')$

(1) $E|\xi(t) - \xi(t')|^2 = F(t) - F(t') \to 0.$

It follows that the r.v. $\xi(\tau - 0) = \text{l.i.m.}_{t \uparrow \tau}\, \xi(t)$ exists, and for $t \ge \tau$

$$E|\xi(t) - \xi(\tau - 0)|^2 = \lim_{t' \uparrow \tau} E|\xi(t) - \xi(t')|^2 = F(t) - F(\tau - 0).$$

Similarly if $\tau$ is a limit point of $T$ from the right. If $\tau$ is a limit point
from both sides, then, by letting $t \downarrow \tau$ and $t' \uparrow \tau$ in (1), we also have

$$E|\xi(\tau + 0) - \xi(\tau - 0)|^2 = F(\tau + 0) - F(\tau - 0).$$

Now, if $\tau \notin T$ is a limit point from the left set $F(\tau) = F(\tau - 0)$,
$\xi(\tau) = \xi(\tau - 0)$, and if it is a limit point from the right but not from
the left set $F(\tau) = F(\tau + 0)$, $\xi(\tau) = \xi(\tau + 0)$. This provides the
asserted extension on the closure $\bar{T}$ of $T$. For it suffices to write that
the increments are orthogonal on $T$ and let one or more end points
approach points $\tau \in \bar{T} - T$; in particular $\xi(\tau + 0) - \xi(\tau - 0)$ is
orthogonal to increments on intervals disjoint from $\{\tau\}$.
Finally, $\bar{R} - \bar{T}$ is a countable sum of intervals $I$ with at least one
end point $\tau$ of $I$ belonging to $\bar{T}$. By setting $F(t) = F(\tau)$, $\xi(t) = \xi(\tau)$
on the $I$’s, we obtain the asserted extension on $\bar{R}$.

When the boundedness condition is not satisfied, then the same proof
shows that the asserted extension exists on the smallest interval containing
$T$, provided the end points which do not belong to $T$ are excluded.

Corollary 1. Under the hypotheses of the extension theorem, the random
function $\xi(t)$ is decomposable into two parts with mutually orthogonal
increments:

$$\xi(t) = \xi_d(t) + \xi_c(t),$$

where $\xi_d(t)$ is a sum of orthogonal r.v.'s (converging in q.m. when denumerable)
and $\xi_c(t)$ is continuous in q.m., and

$$E|d\xi_d(t)|^2 = dF_d(t), \qquad E|d\xi_c(t)|^2 = dF_c(t),$$

where $F_d(t)$ and $F_c(t)$ are the purely discontinuous and the continuous
parts of $F(t)$, respectively.

Extend on $\bar{R}$, let $\{t_j\}$ be the (countable) discontinuity set of $F$, and set

$$\xi_d(t) = \sum_{t_j < t} \left(\xi(t_j + 0) - \xi(t_j - 0)\right), \qquad \xi_c(t) = \xi(t) - \xi_d(t).$$

The assertions follow by elementary computations.

We say that a random function $\xi(t)$ on $T$ with covariance $\gamma(t, t')$ is
orthogonal to its increments, if for $t < t'$

$$E\xi(t)\left(\bar{\xi}(t') - \bar{\xi}(t)\right) = 0, \quad \text{that is,} \quad E\xi(t)\bar{\xi}(t') = E|\xi(t)|^2.$$

Clearly, every such random function $\xi(t)$ has orthogonal increments
and the definition is equivalent to

$$\gamma(t, t') = F(t), \quad F(t) \text{ nondecreasing}, \quad t \le t';$$

if, moreover, $F(t)$ is bounded on $T$, then the foregoing extension on $\bar{R}$
preserves this relation. Conversely

Corollary 2. If a random function $\xi(t)$ on $T$ has orthogonal increments
with bounded second moments, then it can be made orthogonal to its increments
by a suitable change of origin of its random values.

Extend $\xi(t)$ on $\bar{R}$ and subtract $\xi(-\infty)$.

Thus, in the integrals below with respect to random functions $\xi(t)$
which have orthogonal increments with bounded second moments, we
can assume that the $\xi(t)$ are orthogonal to their increments.
Consider now the integration in q.m. criterion and assume that the
random function $Y(t)$ therein is a function $\xi(t)$ with orthogonal increments.
Then, clearly, the double integrals therein reduce to simple
integrals with respect to the nondecreasing function $F(t)$ defined above.
In particular, the harmonizability theorem becomes

a. Let $E|d\xi(s)|^2 = dF(s)$. A second order random function $X(t)$ is
of the form

$$X(t) = \int e^{its}\, d\xi(s)$$

if, and only if, its covariance $\Gamma(t, t')$ is of the form

$$\Gamma(t, t') = \int e^{i(t - t')s}\, dF(s).$$

For such covariances, the question raised at the end of the preceding
subsection has a very simple answer (Khintchine, Kolmogorov).

b. A covariance $\Gamma(t, t')$ is of the form

$$\Gamma(t, t') = \int e^{i(t - t')s}\, dF(s), \quad \operatorname{Var} F < \infty,$$

where $t, t'$ vary over $R$ or over $\{\dots, -1, 0, +1, \dots\}$ if, and only if, it
depends only upon the difference $t - t'$ of its arguments, and when $t, t'$
vary over $R$ it is, moreover, continuous.

Proof. The “only if” assertion is obvious. The “if” assertion follows
from the fact that a covariance is of nonnegative-definite type.
For, the definition of this type reduces to that of nonnegative-definite
functions when $\Gamma(t, t') = f(t - t')$, and then Herglotz’s and Bochner’s
theorems apply.
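
The “only if” direction can be illustrated numerically. The sketch below is an illustration under assumptions: it takes a purely discontinuous $F$ with hypothetical jump points `freqs` and jumps `masses`, forms $f(u) = \int e^{ius}\, dF(s)$, and checks that the matrix $f(t_j - t_k)$ is Hermitian and nonnegative-definite on an arbitrary grid.

```python
import numpy as np

def stationary_covariance(freqs, masses):
    """Return f(u) = sum_k masses[k] * exp(i * u * freqs[k]).

    This is the integral of e^{ius} dF(s) for a purely discontinuous F
    with jumps `masses` located at the points `freqs` (assumed example).
    """
    freqs = np.asarray(freqs, dtype=float)
    masses = np.asarray(masses, dtype=float)

    def f(u):
        u = np.asarray(u, dtype=float)
        return np.exp(1j * u[..., None] * freqs) @ masses

    return f

f = stationary_covariance(freqs=[-2.0, 0.3, 1.1], masses=[0.5, 1.0, 0.25])

t = np.linspace(-3.0, 3.0, 25)
gram = f(t[:, None] - t[None, :])           # matrix Gamma(t_j, t_k) = f(t_j - t_k)
min_eig = np.linalg.eigvalsh(gram).min()    # should be >= 0 up to rounding
total_variation = f(0.0).real               # f(0) = Var F
```

Here `min_eig` is zero up to rounding error (the matrix has rank three), and `f(0.0)` recovers the total variation of $F$.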

A covariance which depends only upon the difference of its argu¬ 
ments is said to be stationary. A random function with a stationary 
covariance is said to be second order stationary. This concept is closely 
related to the usual concept of stationarity (of distributions); for, if a 
random function is stationary in the usual sense and is of second order, 
then it is a fortiori second order stationary, and in the strongly normal
case the converse is true. A stationary random function need not be
second order stationary since it may not be of second order.

Together, the preceding two lemmas and the inversion lemma for
random functions become

D. Harmonic orthogonal decomposition theorem. A second order
random function $X(t)$ on $T = R$ or $\{\dots, -1, 0, +1, \dots\}$ has a harmonic
orthogonal decomposition

$$X(t) = \int e^{its}\, d\xi(s), \quad t \in T,$$

where $E|d\xi(s)|^2 = dF(s)$, and $F(s)$ is of bounded variation on $T$
if, and only if,

$X(t)$ is second order stationary and in the case $T = R$ also continuous in
q.m. at one point $t$.

Then, as $\tau \to \infty$,

$$\frac{1}{2\tau}\int_{-\tau}^{+\tau} e^{-ist} X(t)\, dt \xrightarrow{\text{q.m.}} \Delta_0\xi(s),$$

$$\frac{1}{2\pi}\int_{-\tau}^{+\tau} \frac{e^{-ith} - 1}{-it}\, e^{-ist} X(t)\, dt \xrightarrow{\text{q.m.}} \Delta_h\xi(s),$$

where the integrals reduce to sums when $T = \{\dots, -1, 0, 1, \dots\}$.
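
The first inversion formula can be tried out on a single realization. The sketch below is an illustration under assumptions: the spectrum is discrete, with hypothetical jump points `freqs`, and the jumps of $\xi$ are fixed complex numbers standing for one realization. The integral over $[-\tau, +\tau]$ divided by $2\tau$ is approximated by a mean over a uniform grid.

```python
import numpy as np

freqs = np.array([-1.0, 0.5, 2.0])              # assumed jump points s_k of xi
jumps = np.array([1.0 + 2.0j, -0.5j, 0.75])     # one fixed realization of the jumps

def sample_path(t):
    """X(t) = sum_k jumps[k] * exp(i t s_k): a harmonic decomposition with discrete spectrum."""
    return np.exp(1j * np.outer(t, freqs)) @ jumps

def averaged_transform(s, tau, n_points=400_001):
    """Approximate (1 / (2 tau)) * integral over [-tau, tau] of exp(-i s t) X(t) dt by a grid mean."""
    t = np.linspace(-tau, tau, n_points)
    return np.mean(np.exp(-1j * s * t) * sample_path(t))

at_jump = averaged_transform(0.5, tau=2000.0)      # s is a jump point of xi
off_jump = averaged_transform(1.2, tau=2000.0)     # s is a continuity point of xi
```

For large $\tau$, `at_jump` is close to the jump of $\xi$ at $s = 0.5$ and `off_jump` is close to zero, with an error of order $1/\tau$ in this example.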

Corollary 1. A second order stationary random function on
$\{\dots, -1, 0, +1, \dots\}$ extends on $R$ to a second order stationary random function
continuous in q.m.

Corollary 2. A second order stationary random function $X(t)$ continuous
in q.m. is decomposable into two orthogonal and second order
stationary parts $X(t) = X_d(t) + X_c(t)$, with

$$X_d(t) = \int e^{its}\, d\xi_d(s), \qquad X_c(t) = \int e^{its}\, d\xi_c(s),$$

where the first integral reduces to a countable sum converging in q.m.

What precedes applies to second order r.f.’s of the form $X(t) = \{X_u(t), u \in U\}$,
$t \in T$, that is, whose random values $X(t)$ are second
order random functions of $u$: $E|X_u(t)|^2 < \infty$ whatever be $t$ and $u$.
It suffices to consider the argument $(u, t) \in U \times T$. However, the
arguments $u$ and $t$ may play a nonsymmetric role. For example, the
above random function is second order stationary in $t$ if

$$EX_u(t)\bar{X}_{u'}(t') = \Gamma_{u,u'}(t - t').$$

Clearly, this property is equivalent to the following one: the random
functions $Y_{U_n}(t) = \sum_{u \in U_n} c_u X_u(t)$, $t \in T$, are second order stationary
whatever be the complex numbers $c_u$ and whatever be the finite subset
$U_n \subset U$. Upon applying the harmonic orthogonal decomposition
theorem to the random functions $Y_{U_n}(t)$, we obtain by elementary
computations

Corollary 3. A random function $X(t) = \{X_u(t), u \in U\}$ continuous
in q.m. on $R$ is second order stationary if, and only if,

$$EX_u(t)\bar{X}_{u'}(t') = \int e^{i(t - t')s}\, dF_{u,u'}(s), \quad u, u' \in U,$$

where the functions $F_{u,u'}(s)$ are of bounded variation on $R$ and the functions
$\Delta_h F_{u,u'}(s)$ are of nonnegative-definite type in $u, u'$ for any $s$ and
$h > 0$,

or if, and only if,

$$X_u(t) = \int e^{its}\, d\xi_u(s), \quad u \in U,$$

where the second order random functions $\xi_u(s)$ have mutually orthogonal
increments with

$$E\xi_u(s)\bar{\xi}_{u'}(s') = F_{u,u'}(s) \quad \text{for } s < s'.$$

If $U$ contains only two elements, the first assertion reduces to a result
of Khintchine. If $U$ contains only a finite number of elements, then it
reduces to a result of Cramér.

Another nonsymmetric role of the two arguments $u$ and $t$ appears
in considering them as the real and imaginary parts of a complex argument
$z = u + it$. The definition of second order stationarity becomes:
the random function $X(z)$ is second order stationary if its covariance

$$\Gamma_X(z, z') = f(z + \bar{z}') = f(u + u' + i(t - t'))$$

depends only upon $z + \bar{z}'$. Then, for every fixed $u + u'$, the covariance
is stationary in $t, t'$, the harmonic orthogonal decomposition theorem
applies, and it is not difficult to obtain

Corollary 4. A random function $X(z)$, $z = u + it$, continuous in
q.m. in the complex-plane strip $S$: $a < u < b$, is second order stationary
if, and only if, for $z, z' \in S$,

$$\Gamma_X(z, z') = \int e^{(z + \bar{z}')s}\, dF(s), \quad \operatorname{Var} F < \infty,$$

or, if and only if, for $z \in S$,

$$X(z) = \int e^{zs}\, d\xi(s),$$

where the random function $\xi(s)$ is orthogonal to its increments with
$E\xi(s)\bar{\xi}(s') = F(s)$ for $s < s'$.
Remark. By replacing the argument $t$ by a complex argument $z$ in
the preceding extension, Corollary 3 extends to random functions of $z$.
It suffices to replace therein $t$ by $z$ and $t'$ by $z'$.

— 37.6. Normality and almost-sure properties. The class of normal 
r.v.’s is closed under linear combinations and passages to the limit in 
q.m. Since differentiation and integration in q.m. (with one of the 
two functions being a sure function) are obtained by such operations, 
it follows that the stability of normal laws (so far considered only for 
sums of independent r.v.’s) extends to the calculus in q.m.: 

A. Normal stability theorem. Normality is preserved under differ¬ 
entiations and integrations in q.m. 

In fact, the foregoing remarks apply word for word to second order 
random functions which obey infinitely decomposable laws — necessarily 
with finite variances (that is, all the linear combinations of the random 
values obey such laws). But more is true when the second order random
functions are normal: many of the properties in q.m. established
in this section become then a.s. properties. We saw that to every covariance
there corresponds a strongly normal random function and, for
the random values of such functions, orthogonality becomes independence.
If a series of these orthogonal values converges in q.m.,
then, the summands being independent, the convergence is almost 
sure. Therefore, upon applying the foregoing theorem to the three 
orthogonal decomposition theorems of the preceding subsection, we 
obtain 


Corollary 1. If the covariance $\Gamma_X(t, t')$ is an analytic function of
$tt'$, then the random function $X(t)$ is normal if, and only if, its derivatives
in q.m. are normal. If, moreover, $X(t)$ is real-valued, then its derivatives
are independent and the Taylor expansion of $X(t)$ converges a.s.

Corollary 2. If the covariance $\Gamma_X(t, t')$ is continuous on a closed
interval $I$, then the random function $X(t)$ is normal if, and only if, the
r.v.'s $\lambda_n\xi_n = \int_I X(t)\bar{\psi}_n(t)\, dt$ are normal.
If, moreover, $X(t)$ is real-valued, then the $\xi_n$ are independent and the
proper decomposition of $X(t)$ converges a.s.


Corollary 3. If the continuous covariance $\Gamma_X(t, t')$ is stationary, then

(i) The random function $X(t)$ is normal if, and only if, the random
function $\xi(s)$ which figures in its harmonic decomposition $X(t) = \int e^{its}\, d\xi(s)$
is normal. If $\xi(s)$ is strongly normal, then its increments are independent
and $\int_a^b e^{its}\, d\xi(s) \to X(t)$ a.s. as $a \to -\infty$, $b \to +\infty$.

(ii) If $\xi(s) = \xi_d(s) + \xi_c(s)$ is the decomposition of $\xi(s)$ with independent
increments into its purely discontinuous and its continuous in q.m.
parts, then $X(t) = X_d(t) + X_c(t)$ is decomposed into two independent
parts: $X_d(t) = \int e^{its}\, d\xi_d(s)$, which is an a.s. convergent series of independent
r.v.'s, and $X_c(t) = \int e^{its}\, d\xi_c(s)$, which obeys an infinitely decomposable
law with finite variances.

Only the very last assertion deserves proof. It is due to the fact that
$E|d\xi_c(s)|^2 = dF_c(s)$, where the function $F_c(s)$ on $R$ is nondecreasing,
bounded, and continuous. Thus, $R$ can be subdivided into intervals $I$
on which the increments of $F_c(s)$ are bounded by $\epsilon > 0$ arbitrarily
small. It follows that $X_c(t)$ is a sum of an arbitrarily large number of
independent r.v.’s $\int_I e^{its}\, d\xi_c(s)$ whose variances are uniformly bounded
by an arbitrarily small number, and this implies the assertion for every
r.v. $X_c(t)$. Since the same is true of every linear combination of these
r.v.'s, the random function $X_c(t)$ obeys an infinitely decomposable law.

Observe that the “if and only if” assertions remain valid when
“normal” is replaced by “infinitely decomposable.”

37.7. A.s. stability. If the second order random function $X(t)$ with
covariance $\Gamma(t, t')$ is not normal, we have to impose supplementary
restrictions upon the covariance in order to transform properties in
q.m. into a.s. properties. We shall content ourselves with conditions
for a.s. stability corresponding to the strong law of large numbers. If
the random function is defined on the set of all integers, we look for

$$Y(n) = \frac{1}{n}\sum_{k=1}^{n} X(k) \xrightarrow{\text{a.s.}} 0,$$

and if it is defined on $T = [0, +\infty)$, we look for

$$Y(\tau) = \frac{1}{\tau}\int_0^{\tau} X(t)\, dt \xrightarrow{\text{a.s.}} 0,$$

where we assume $X(t)$ continuous in q.m. on $T$.

In both cases, the conditions we shall find and the proof are the same,
except that in the integer case the integrals are to be replaced by sums.
Therefore, we shall give the proof in the slightly more involved case of
$T = [0, +\infty)$.

First, we reduce the random function $Y(\tau)$ to a sequence of r.v.’s.
We consider a sequence $m^a$, $m = 1, 2, \dots$, and the symbols $c$, $c'$
are finite positive constants while $b$ is a nonnegative constant. For
$m^a \le \tau < (m + 1)^a$, we can set

$$\frac{\tau}{m^a} Y(\tau) = Y(m^a) + Z(m^a, \tau), \qquad Z(m^a, \tau) = \frac{1}{m^a}\int_{m^a}^{\tau} X(t)\, dt.$$

Since

$$U(m^a) = \sup_{m^a \le \tau < (m+1)^a} |Z(m^a, \tau)| \le \frac{1}{m^a}\int_{m^a}^{(m+1)^a} |X(t)|\, dt,$$

we have

$$E|U(m^a)|^2 \le \frac{1}{m^{2a}}\int_{m^a}^{(m+1)^a}\int_{m^a}^{(m+1)^a} E|X(t)X(t')|\, dt\, dt' \le \left(\frac{1}{m^a}\int_{m^a}^{(m+1)^a} \sqrt{\Gamma(t, t)}\, dt\right)^2.$$

We are led to assume that $\Gamma(t, t) \le ct^{2b}$ so that

$$\sum E|U(m^a)|^2 \le c\sum \frac{\left\{(m+1)^{a(b+1)} - m^{a(b+1)}\right\}^2}{(b+1)^2\, m^{2a}} \sim c'\sum \frac{1}{m^{2(1 - ab)}},$$

and the series converges when $ab < 1/2$. Then by Tchebichev’s inequality
and the Borel-Cantelli lemma $U(m^a) \xrightarrow{\text{a.s.}} 0$.

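Similarly, by taking the sequence $q^m = (1 + \epsilon)^m$ with $\epsilon > 0$, we find
that

$$U(q^m) \le \frac{1}{q^m}\int_{q^m}^{q^{m+1}} |X(t)|\, dt \quad \text{a.s.}$$

and are led to assume that $|X(t)| \le c$, so that

$$U(q^m) \le c\, \frac{q^{m+1} - q^m}{q^m} = c\epsilon \quad \text{a.s.}$$

and, hence, $U(q^m)$ is arbitrarily small for $\epsilon > 0$ sufficiently small. Thus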


a. If, for $t$ sufficiently large,

(i) $\Gamma(t, t) \le ct^{2b}$ and $ab < 1/2$, then $Y(\tau) - Y(m^a) \xrightarrow{\text{a.s.}} 0$ as $\tau \to \infty$.

(ii) $|X(t)| \le c$, then $Y(\tau) - Y(q^m)$ becomes a.s. arbitrarily small for
$q - 1 > 0$ sufficiently small.
It remains to insure the a.s. convergence of the sequences $Y(m^a)$ or
$Y(q^m)$ to zero. This is obtained as follows:


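A. A.s. stability theorem. Let the random function $X(t)$ on $T = [0, +\infty)$
with covariance $\Gamma(t, t')$ be continuous in q.m., and let $c$, $c'$, $\gamma$ be
finite positive constants.

If, for large $\tau > 0$

(i) $\Gamma(t, t) \le c$ and $\dfrac{1}{\tau^2}\displaystyle\int_0^{\tau}\int_0^{\tau} \Gamma(t, t')\, dt\, dt' \le \dfrac{c'}{\tau^{\gamma}}$
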


or 

(ii) I X{t) I ^ r and f % f f dt dt f S c, 

Jo T ^0 ^0 

then, as $\tau \to \infty$,

(iii) $\dfrac{1}{\tau}\displaystyle\int_0^{\tau} X(t)\, dt \xrightarrow{\text{a.s.}} 0.$


The same is true if $X(t)$ is defined on the set of all integers, the integrals
being replaced by the corresponding sums.

Proof. The first condition in (i) permits us to apply the first part of
the preceding lemma with $b = 0$ and arbitrary $a > 0$. The second
condition in (i) yields

(1) $\displaystyle\sum_p E|Y(p)|^2 = \sum_p \frac{1}{p^2}\int_0^p\int_0^p \Gamma(t, t')\, dt\, dt' < \infty$

for $p = m^a$ with $a > 1/\gamma$. The first assertion follows. Similarly, the
first condition in (ii) permits us to apply the second part of the preceding
lemma, and the second condition in (ii) yields by elementary computations
(1) with $p = q^m$ however small be $q - 1 > 0$. The second assertion
follows.
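
The summability step in the proof can be illustrated in the integer case. The sketch below is an illustration under assumptions, not a restatement of the theorem: it takes the hypothetical covariance $\Gamma(j, k) = \rho^{|j - k|}$ with $0 < \rho < 1$, for which $E|Y(n)|^2 \le c'/n$ with $c' = (1 + \rho)/(1 - \rho)$, so that a bound of the form $c'/n^{\gamma}$ holds with $\gamma = 1$. It then sums $E|Y(p)|^2$ along the subsequence $p = m^a$ with $a = 2 > 1/\gamma$.

```python
import numpy as np

def second_moment_of_mean(n, rho):
    """E|Y(n)|^2 = n^{-2} * sum_{j,k=1}^{n} rho^{|j-k|} for Gamma(j, k) = rho^{|j-k|}."""
    d = np.arange(1, n)                                   # lags 1, ..., n-1
    return (n + 2.0 * np.sum((n - d) * rho ** d)) / n**2

rho = 0.6                                  # assumed correlation parameter
c_prime = (1 + rho) / (1 - rho)            # then E|Y(n)|^2 <= c_prime / n

# Subsequence p = m^a with a = 2, as in the proof (a > 1/gamma with gamma = 1).
subsequence = np.arange(1, 200) ** 2
series = sum(second_moment_of_mean(int(p), rho) for p in subsequence)
```

The partial sums `series` stay below $c' \sum 1/m^2$, which is the convergence used, together with Tchebichev’s inequality and the Borel-Cantelli lemma, to obtain a.s. convergence along the subsequence.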

If the random function $X(t)$ is second order stationary either on
$T = \{\dots, -1, 0, +1, \dots\}$ or continuous in q.m. on $T = R$, then the
first condition of (i) holds for the random function $e^{-ist}X(t)$ whose
covariance is $\int e^{i(t - t')(s' - s)}\, dF(s')$ with the function $F(s')$ of bounded
variation. Upon replacing $\Gamma(t, t')$ by this expression in the second
condition of (i), we obtain

Corollary. Let the random function $X(t)$ be second order stationary
and continuous in q.m. on $T = R$. If there exist two finite positive constants
$c$ and $\gamma$ such that, for large $\tau > 0$

$$\frac{4}{\tau^2}\int \frac{\sin^2 \frac{\tau}{2}(s - s')}{(s - s')^2}\, dF(s') \le \frac{c}{\tau^{\gamma}},$$

then, as $\tau \to \infty$,

$$\frac{1}{\tau}\int_0^{\tau} e^{-ist} X(t)\, dt \xrightarrow{\text{a.s.}} 0.$$

The same is true if $X(t)$ is defined on $T = \{\dots, -1, 0, +1, \dots\}$, $\tau$ being
replaced by $n$ and the last integral being replaced by the corresponding sum.

The reader is invited to compare the last assertion with the second part
of the stationarity theorem (with $r = 2$).

It is easily seen that the conclusion holds a.e. (in Lebesgue measure)
in $s$. In fact, if the symmetric derivative

$$F'(s_0) = \lim_{h \to 0} \frac{F(s_0 + h) - F(s_0 - h)}{2h}$$

exists and is finite (and this is true a.e., since the d.f. $F$ has a.e. a finite
derivative), then the integral which figures in the hypothesis is
$O(F'(s_0)/\tau)$, and hence the conclusion holds with $s = s_0$. It suffices to
take $s_0$ for origin of values of $s$ and use the relation



$$\lim_{\tau \to \infty} \int \frac{\sin^2 \tau s}{\pi\tau s^2}\, dF(s) = F'(0),$$

which can be proved as follows: Write the integral in the form

$$\int_0^{\infty} \frac{\sin^2 \tau s}{\pi\tau s^2}\, d\{F(s) - F(-s)\},$$

split it into $\int_0^a + \int_a^{\infty}$, select $a > 0$ for a given $\epsilon > 0$ so as to have

$$\left|\frac{F(s) - F(-s)}{2s} - F'(0)\right| < \epsilon \quad \text{on } (0, a),$$

integrate by parts the integral $\int_0^a$, and let $\tau \to \infty$ and then $\epsilon \to 0$.


COMPLEMENTS AND DETAILS


In what follows $\Gamma(t, t')$ denotes the covariance of a second order random
function $X(t)$ with $EX(t) = 0$ on $T \subset R$, unless otherwise stated.

1. $\Gamma(t, t')$ on $R \times R$ is said to be a triangular covariance if
$\Gamma(t, t') = F_1(t)\bar{F}_2(t')$ for $t \le t'$. Such a product, with $F_2 \ne 0$, is a covariance if, and only if, $F_1/F_2$ is
nonnegative and nondecreasing on $R$. Construct a few random functions with
triangular covariances. What about random functions orthogonal to their
increments?

2. Let $\bar{M}(t, t')$ denote the complex-conjugate of the function $M(t, t')$ on
$T \times T$.

If $\int_T\int_T M(t, \tau)\,\Gamma(\tau, \tau')\,\bar{M}(t', \tau')\, d\tau\, d\tau'$ exists and is finite, then it is covariance
of the random function $\int_T M(t, \tau)X(\tau)\, d\tau$. What about $\sum_{\tau} M(t, \tau)X(\tau)$?

Application. The iterated $\Gamma^{(n)}(t, t')$ is defined by

$$\Gamma^{(m+n)}(t, t') = \int_T \Gamma^{(m)}(t, \tau)\,\Gamma^{(n)}(\tau, t')\, d\tau$$

assumed to exist and be finite. Every iterate of a covariance is a covariance.
(This follows for $\Gamma^{(2n+1)}(t, t')$ from what precedes. For $\Gamma^{(2n+2)}(t, t')$ begin by
verifying directly that $\Gamma^{(2)}(t, t')$ is a covariance.)

3. Let $H$ be a family of functions $h$ on $T$ forming a euclidean or, more generally,
a Hilbert space; denote, by overabusing the notations, the scalar
product by $(h(t), h'(t))$. A function $\Gamma(t, t')$ on $T \times T$ is said to be a reproducing
kernel of $H$ if $\Gamma(t, t') \in H$ for every fixed $t'$ and $\Gamma(t, t')$ reproduces every $h \in H$,
that is, $h(t') = (h(t), \Gamma(t, t'))$.

$\Gamma(t, t')$ is a reproducing kernel of some family $H$ if, and only if, it is covariance
of a random function $X(t)$ on $T$. If the $Y$'s are limits in q.m. of all possible
linear combinations of random values of $X(t)$, then there exist $Y$'s such that
$h(t) = EY\bar{X}(t)$, $(h_1(t), h_2(t)) = EY_1\bar{Y}_2$.

Examples

1° Let $T = \{t_1, t_2, \dots\}$. Consider the space of all sequences $\{h(t_1), h(t_2), \dots\}$
with $\sum |h(t_n)|^2 < \infty$, $(h_1(t), h_2(t)) = \sum h_1(t_n)\bar{h}_2(t_n)$. Then $\Gamma(t_m, t_n) = \delta_{mn}$,
$X(t)$ on $T$ are second order random functions with orthonormal random values.

2° Let $T = R$ and $F$ on $R$ be a nondecreasing function of bounded variation.
Consider all functions of the form $h(t) = \int e^{itx} g(x)\, dF(x)$ with $\int |g(x)|^2\, dF(x) < \infty$.
Then $\Gamma(t, t') = \int e^{i(t - t')x}\, dF(x)$, $X(t)$ on $T$ are second order stationary
and continuous in q.m.

4. Let $\{t_1, t_2, \dots\} \subset T$, set

$$D\binom{t, t_1, \dots, t_n}{t', t_1, \dots, t_n} = \begin{vmatrix} \Gamma(t, t') & \Gamma(t, t_1) & \cdots & \Gamma(t, t_n) \\ \Gamma(t_1, t') & \Gamma(t_1, t_1) & \cdots & \Gamma(t_1, t_n) \\ \vdots & \vdots & & \vdots \\ \Gamma(t_n, t') & \Gamma(t_n, t_1) & \cdots & \Gamma(t_n, t_n) \end{vmatrix}$$

and denote by $D(t, t_1, \dots, t_n)$ the square root of this determinant when $t' = t$.
Assume that the $X(t_k)$ are nondegenerate and are linearly independent. Then
$D^2(t_1, \dots, t_n) > 0$.

(a) The foregoing determinant is a covariance, say, of the random function
$D(t_1, \dots, t_n)\, Y_n(t)$ where

$$Y_n(t) = \frac{1}{D^2(t_1, \dots, t_n)} \begin{vmatrix} X(t) & \Gamma(t, t_1) & \cdots & \Gamma(t, t_n) \\ X(t_1) & \Gamma(t_1, t_1) & \cdots & \Gamma(t_1, t_n) \\ \vdots & \vdots & & \vdots \\ X(t_n) & \Gamma(t_n, t_1) & \cdots & \Gamma(t_n, t_n) \end{vmatrix}.$$

Set $Y_0(t) = X(t)$, $Y_n(t) = X(t) - \sum_k c_{nk}(t)X(t_k)$, $n = 1, 2, \dots$. Then the $Y_n(t)$ are
determined either by the condition that they be orthogonal to the $X(t_k)$ or by
the condition that their variance be the smallest possible. We have

$$D^2(t) \ge D^2(t, t_1)/D^2(t_1) \ge \cdots \ge D^2(t, t_1, \dots, t_n)/D^2(t_1, \dots, t_n) \ge \cdots$$

and

$$D^2(t_1)\, D^2(t_2) \cdots D^2(t_n) \ge D^2(t_1, \dots, t_n).$$

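(b) Set $\xi_k = Y_{k-1}(t_k)/\sqrt{E|Y_{k-1}(t_k)|^2}$. Then $E\xi_k\bar{\xi}_l = \delta_{kl}$ and

$$Y_n(t) = X(t) - \sum_{k=1}^{n} a_k(t)\xi_k, \qquad a_k(t) = EX(t)\bar{\xi}_k.$$

The sequence $Y_n(t) \xrightarrow{\text{q.m.}} Y(t)$ such that

$$Y(t_n) = 0, \quad EY(t)\bar{X}(t_n) = 0, \quad EY(t)\bar{Y}(t') = \lim D\binom{t, t_1, \dots, t_n}{t', t_1, \dots, t_n} \Big/ D^2(t_1, \dots, t_n)$$

and

$$X(t) = X_\infty(t) + Y(t)$$

with

$$EX_\infty(t)\bar{Y}(t') = 0, \qquad X_\infty(t) = \sum_n a_n(t)\xi_n.$$
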

Let A) = /,/„= U n t where C/ is a transformation on T to 7 ； so that = {/ n } 

moves with/. Let 7?(/ n ) = Y(u)/yE\Y{ujJ\ « = 0, 1, …， so that the 7 ?(/ n ) 
form an orthonormal system moving with /• 

The random function X{t) is decomposable into X{t) = X x {t) + X 2 (f) where 
the random functions Xi(f) and X 2 (f) are orthogonal on { U n t), 

^l(^) = H ^n(/)^i(/ n ), 义 2(/) = X) ^n(/)77(/ n ) 

t n n «0 

with 

Z )(/, ’!•，•••)/、 、，/、 , 、 - 

D (/ 1} / 2) •••)”(’) = x ^) - “fed Ehh = 

(c) What are the properties of ch.f.'s and second order stationary random
functions which follow from (a) and (b)?

5. To every continuous stationary covariance there corresponds a random
function of the form $X(t) = \alpha e^{i\beta t}$ where $\alpha$ and $\beta$ are independent r.v.'s. Find
its harmonic decomposition.
6. Let $\xi(t)$ on $[0, +\infty)$ be a random function with orthogonal increments
and $E|\xi(t)|^2 = t$. Investigate the random functions $X(t) =$

XQ)= 吩 + 1) — |(/)， /(/) = (/ + r) d^{r) with j | g(l) | 2 dt = 1, 


m 


- r) n 炎 (/) ， X{t) = f r T - I d 汾一 t). 
s Jo 


7. Let the random function $X(t)$ be defined for $t = n = 0, 1, 2, \dots$. $X(t)$ is
second order stationary and a second order chain if, and only if, $\Gamma(m, n) = f(m - n) = f(0)\, e^{(-a + ib)(m - n)}$
with $a \ge 0$. What can be said about the harmonic
decomposition of such random functions and of their covariances?

Extend to $X(t)$ on $[0, +\infty)$.

8. It is assumed in what follows that the random functions under consideration
are of second order and that the integrals and derivatives are in q.m. and
exist. In every case the reader shall write the conditions for existence in terms
of covariances and assume or prove that they are fulfilled.

Let the random functions $\xi(s)$ and $\eta(s)$ on $R$ be independent. Let
$X(t) = \int e^{its}\, d\xi(s)$ and $Z(t) = \int e^{its}\eta(s)\, d\xi(s)$. We say that $Z(t)$ is obtained from
$X(t)$ by a linear operation with gain $\eta(s)$. Usually, it is assumed that $\eta(s)$ is
degenerate (into a sure function) and that $X(t)$ is second order stationary and
continuous in q.m., that is, $E|d\xi(s)|^2 = dF(s)$, $\operatorname{Var} F < \infty$.

(a) Express $\Gamma_Z$ in terms of $\Gamma_\xi$ and $\Gamma_\eta$; find its form in the usual case.

Interpret $\dfrac{dX(t)}{dt}$ as a linear operation; what is the corresponding gain?

$\{X(t), Z(t)\}$ is second order stationary and continuous in q.m., if, and only
if, so is $X(t)$.

(b) Express $\Delta(t) = X(t + \theta) - Z(t)$ in terms of $\xi(s)$ and $\eta(s)$. We say that
$\Delta(t)$ is the error function when $X(t + \theta)$ is replaced by $Z(t)$. Express $\Gamma_\Delta$ in
terms of $\Gamma_\xi$ and $\Gamma_\eta$. In the usual case $\Gamma_\Delta(t, t) = \int |e^{i\theta s} - \eta(s)|^2\, dF(s)$ and is
minimized for $\eta(s) = \eta_0(s)$ such that for all $t$, $\int e^{its}\eta_0(s)\, dF(s) = \int e^{its} e^{i\theta s}\, dF(s)$,
provided such an $\eta_0(s)$ exists; the difference for $\eta_0(s)$ and any $\eta(s)$ is given by
$\int |\eta_0(s) - \eta(s)|^2\, dF(s)$. The linear prediction problem is that of existence and
determination of $\eta_0(s)$.

(c) If $\eta(s) = \int e^{-ist}\, dY(t)$, with same affixes for $\eta(s)$ and $Y(t)$ if any, then
$Z(t) = \int X(t - \tau)\, dY(\tau)$ is equivalent to $Z(t) = \int e^{its}\eta(s)\, d\xi(s)$. We say that
the convolution is a filtering of $X(t)$ by $Y(t)$ with gain $\eta(s)$. What are $X(t)$ and
$Y(t)$ in the usual case?

Filtering by $Y_1(t) + Y_2(t)$ is a filtering with gain $\eta_1(s) + \eta_2(s)$. Filtering by
$Y_1(t)$ and then by $Y_2(t)$ is a filtering with gain $\eta_1(s)\eta_2(s)$; find the corresponding
$Y(t)$.

(d) A convolution defined by $Z(t) = \int X(t - \tau)\, dY(\tau)$ (without reference to
$\eta(s)$ which may not exist) will be called averaging of $Y(t)$ by $X(t)$. Usually,
this terminology is used when $X(t)$ is degenerate and $E|dY(t)|^2 = dt$.
Let $X(t)$ be degenerate, let $E|dY(t)|^2 = dG(t)$ and denote by $\mu$ the $\sigma$-finite
measure on the Borel field in $R$ determined by the finite function $G$ on $R$. Then,
up to a constant factor $\Gamma_Z$ is a ch.f. with a $\mu$-continuous d.f. and we can write
$Z(t) = \int e^{its} g(s)\, dY(s)$; find $g(s)$ in the usual case. Conversely, if $\Gamma_Z$ has the
foregoing properties, then $Z(t)$ is an averaging of a $Y(t)$ with the foregoing property
by a degenerate $X(t)$.






Part Five 


ELEMENTS OF 
RANDOM ANALYSIS 


As soon as random functions on more general sets than sets of integers
appear, random analysis comes into its own. It is concerned
with analytical properties and, in particular, with local ones such as
continuity. The most important types of random functions isolated so
far are the decomposable, martingale, and Markov ones. Foundations
of random analysis and analysis of decomposable and martingale types
are due primarily to P. Lévy and to Doob. Analysis of the Markov
type was founded primarily by Kolmogorov and by Feller.

Investigations of random functions rely very heavily upon the particular
case of random sequences. But, by their very nature, they are
on a higher level of mathematical sophistication. The less involved
portions may be covered first: 38.1, 38.4, 39, 41, 43.1, 44.1.








FOUNDATIONS; MARTINGALES
AND DECOMPOSABILITY 


§38. FOUNDATIONS 

Random functions were defined as families $X_T = (X_t, t \in T)$ of r.v.'s
on some pr. space $(\Omega, \mathcal{A}, P)$. According to convenience, the random values
at $t$ are denoted by $X_t$ or $X(t)$, the values of $X_t$ at $\omega$ are denoted by
$X_t(\omega)$ or $X(\omega, t)$ and, unless otherwise stated, $\omega$ and $s, t, u$, with or without
affixes, are elements of $\Omega$ and of $T$, respectively.

Since a function is a mapping of a space (its domain) to a space (its
range space), the above definition is to be completed by specifying the
domain and the range space of the random function. We are at liberty
to select them according to the argument $\omega$, $t$ or $(\omega, t)$. Then, in order
to proceed to the analytical study of random functions, or random analysis,
concepts such as extrema, continuity, measurability, are to be introduced.
Thus $\sigma$-fields and/or topologies (that is, concepts of limit) are
to be selected in the domains and/or range spaces. There was no such
problem for random sequences: The domain was the set of natural integers
$n = 1, 2, \cdots$ with limit as $n \to \infty$. The range space was the
space of r.v.'s on some pr. space with limits in pr., a.s., in the $r$th mean.
Analytical questions, such as continuity in the argument or measurability
of the limits, did not arise except for the existence of limits as $n \to \infty$.
However, for more general families of r.v.'s these questions are to be
given a precise meaning before proceeding to the study of analytical
properties of various types of r.f.'s.

In the literature, the terms "random function," "random process"
and "stochastic process" are treated as synonymous. In fact, "process"
means sometimes a family of r.v.'s and sometimes a class of such families,
and it is deemed preferable to separate these meanings. A random function
(r.f.) will be a family $X_T = (X_t, t \in T)$ of r.v.'s. A (random or
stochastic) process will be a class of r.f.'s with a common conditional law
(given the "initial" or "boundary" or "lateral" conditions). In intuitive
terms, consider the argument $t$ belonging to, say, $T = [0, \infty)$ as "time"
and the conditional law of a r.f. $X_T$ given, say, $X_{T_0}$, $T_0 \subset T$, as its "law
of evolution." Then a process $(X_T \mid X_{T_0})$ is the class of all r.f.'s on our
pr. space with the same law of evolution $\mathcal{L}(X_T \mid X_{T_0})$, and to every
choice of $X_{T_0}$ there corresponds a r.f. of the process. "Markov" processes,
whose analytical properties are investigated in the next chapter, are
of this type with $T_0 = \{0\}$, and those which lend themselves to a detailed
analysis are regular:

We say that $(X_T \mid X_0)$ is a regular process if there exists a regular c.pr.
$P^{X_0}$ of events defined on the process. There are two ways of viewing a
regular process. The first one corresponds to underlying laws only:
the process is a family of r.f.'s $X_T$ with laws determined by the family
of laws $\mathcal{L}(X_T \mid X_0 = x)$, $x \in \mathcal{X}$, and the choice of the initial law $\mathcal{L}(X_0)$
on $\mathcal{S}$. The second one corresponds to underlying functions: the process
is a family $X_T$ of measurable functions $X_t$, $t \in T$, on a measurable
space $(\Omega, \mathcal{A})$ to a measurable space $(\mathcal{X}, \mathcal{S})$ and a family $(P_x, x \in \mathcal{X})$ of
pr.'s on $\mathcal{A}$; to every choice of the initial distribution $P_0$ there corresponds
a r.f. $X_T$ on the pr. space $(\Omega, \mathcal{B}, P)$ with $\mathcal{B} = \mathcal{B}(X_T)$ and $P$
on $\mathcal{B}$ defined by $PB = \int P_0(dx)\,P_x B$, $B \in \mathcal{B}$. We shall choose the point
of view according to convenience. But first we have to install the apparatus
of random analysis.

38.1. Generalities. We examine possible complete definitions of r.f.'s
considered as mappings. We take them on some unspecified but fixed
pr. space $(\Omega, \mathcal{A}, P)$.

Analogy with random sequences yields the following

$\mathcal{R}$-definition. A r.f. $X_T$ is a function on a set $T$ to the space $\mathcal{R}$ of r.v.'s
on the pr. space.

In general, $T$ is some set in the Euclidean line $R = (-\infty, +\infty)$ or the
compactified Euclidean line $\bar{R} = [-\infty, +\infty]$. Unless otherwise stated,
it will be so, and we shall denote by $\bar{T}$ the closure (or adherence) of $T$
in the corresponding topology.

The values $X_t$ at $t$ are r.v.'s, that is, finite measurable functions on
the pr. space to the Borel line. We have encountered more general random
elements such as complex-valued r.v.'s or random vectors. Their
common feature is that they take their values in (linear complete) separable
metric spaces $\mathcal{X}$ and the Borel sets in these spaces are topological
Borel sets, the elements of the $\sigma$-fields $\mathcal{S}$ generated by the topology (the
class of open sets). What follows either is directly applicable to such
random elements or can be transposed without difficulty. Therefore, we
shall denote the state space (the measurable space of values of each r.v.)
by $(\mathcal{X}, \mathcal{S})$ but, to fix the ideas, $\mathcal{X}$ will be a Borel set in $R$ or $\bar{R}$, unless
otherwise stated.

In the range space $\mathcal{R}$ we have at our disposal various types of limit
based upon probability. However, in order that such limits, when they
exist, be unique (that is, the corresponding topology be separated) we
are led, as in the case of random sequences, to identify equivalent r.v.'s
and, more generally, equivalent measurable functions: $X$ and $\tilde{X}$ are
equivalent when $X = \tilde{X}$ on $N^c$; $N$, with or without affixes, will denote a
null event. In turn, this identification already led us to a slight extension
of the concept of r.v., to be considered either as a representative element
only of a class of equivalence or as an a.s. defined, a.s. finite, real-valued
measurable function on the pr. space.

In the case of limits in pr. and in the $r$th mean, we know that the spaces of
equivalence classes are linear complete metric spaces with the corresponding
distances defined by $d(X, Y) = E\{|X - Y|/(1 + |X - Y|)\}$ and
$d(X, Y) = E|X - Y|^r$ or $E^{1/r}|X - Y|^r$ according as $r < 1$ or $r \ge 1$;
for $r \ge 1$ the spaces $L_r$ are Banach spaces with norm $\|X\|_r = E^{1/r}|X|^r$.
Classes of equivalence are partially ordered by the relation $X \le Y$ a.s.
or, equivalently, $g(X) \le g(Y)$ a.s. where $g$ is some real-valued strictly
increasing finite Borel function, otherwise arbitrary; in particular, we
can select a bounded and continuous function $g$, say, $g = \mathrm{Arctan}$. Thus,
whenever we are concerned with order relations or existence and uniqueness
of limits along $T$, we may assume without loss of generality that
our r.f. $X_T$ is bounded, that is, all its values $X_t$ are uniformly bounded by
some finite constant.
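The two devices just described can be tried numerically. The sketch below is only an illustration under stated assumptions: the Gaussian samples, the sample size, and the helper name `dist_in_pr` are hypothetical choices, not part of the text. It estimates the distance $E\{|X - Y|/(1 + |X - Y|)\}$ by Monte Carlo for a sequence converging in pr., and checks that Arctan preserves order while bounding the values.

```python
import math
import random

def dist_in_pr(xs, ys):
    """Monte Carlo estimate of d(X, Y) = E[|X - Y| / (1 + |X - Y|)]."""
    return sum(abs(x - y) / (1.0 + abs(x - y)) for x, y in zip(xs, ys)) / len(xs)

random.seed(0)
n_samples = 20000
x = [random.gauss(0.0, 1.0) for _ in range(n_samples)]

# Y_n = X + noise / n converges to X in pr., so d(X, Y_n) should shrink with n.
dists = []
for n in (1, 10, 100):
    y = [xi + random.gauss(0.0, 1.0) / n for xi in x]
    dists.append(dist_in_pr(x, y))

# Arctan is strictly increasing and bounded: it preserves order relations
# while forcing all values into (-pi/2, pi/2).
pairs = [(random.gauss(0.0, 5.0), random.gauss(0.0, 5.0)) for _ in range(1000)]
order_kept = all((a <= b) == (math.atan(a) <= math.atan(b)) for a, b in pairs)
bounded = all(abs(math.atan(a)) < math.pi / 2 for a, _ in pairs)
```

The distance is always below 1, so it measures closeness in pr. without any moment assumption, which is why it serves for arbitrary r.v.'s.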

At first sight, identification of equivalent r.v.'s leads to identification
of equivalent r.f.'s: We say that $X_T$ and $\tilde{X}_T$ are equivalent r.f.'s, and
write $X_T \sim \tilde{X}_T$, if $X_t = \tilde{X}_t$ a.s., that is, $X_t = \tilde{X}_t$ on $N_t^c$, for every
$t \in T$; thus, the equivalence class of a r.f. $X_T$ is characterized by the
common law of all the elements of the class. When, moreover, the set
$N = \bigcup_{t \in T} N_t$ is null, that is, $X_t = \tilde{X}_t$ on $N^c$ for all $t \in T$, we say that
$X_T$ and $\tilde{X}_T$ are a.s. equal r.f.'s. When the set $T$ is not countable, the
foregoing union set may be nonnull or even nonmeasurable, so that
equivalence of r.f.'s does not imply their a.s. equality and a.s. equality
classes are subclasses of equivalence classes of r.f.'s. Identification of
equivalent r.f.'s looks natural since the pr. space is to be but a frame of
reference, that is, the pr. properties of r.f.'s are to be describable in terms
of their laws. Yet, while no difficulties arise in the case of random sequences,
in the general case we are faced with analytical properties of
r.f.'s which vary within the equivalence classes and thus are not describable
in terms of the common laws alone. For example, let $T = [0, 1]$,
take for pr. space the Lebesgue interval $[0, 1]$, and consider r.f.'s $X_T$
defined as follows: $X_t(\omega) = 0$ except at $\omega_t$ where $X_t(\omega_t) = 1$; the $\omega_t$ can
be varied in any way from one r.f. to another without altering their
equivalence. Consider $Y = \sup_{t \in T} X_t$, which takes at most two values 0
and 1, and set $A = [Y = 1] = (\omega_t, t \in T)$. We can select the $\omega_t$ so that
$A$ be any set in $\Omega$. For instance, $A = \emptyset$ and $Y = 0$ for the r.f. with
every $\omega_t \in \emptyset$ (that is, with no exceptional points), $A = \Omega$ and $Y = 1$ for
the r.f. with every $\omega_t = t$, and $Y$ is not measurable for a r.f. with $\omega_t$
ranging over a nonmeasurable set.

Thus we are forced to back down and to limit ourselves to subclasses of
equivalence classes of r.f.'s, their choice to be based upon the requirement
that limits along $T$ be measurable, hence the possibility of expressing
their analytical properties in terms of the common laws. Luckily
such a choice is always possible within any equivalence class and the
next subsection is devoted to a specific choice, that of "separable"
r.f.'s, due to Doob.

Instead of emphasizing the argument "$t$" we may emphasize the argument
"$\omega$" and thus consider r.f.'s $X_T$ as functions on $\Omega$ to the space
$\mathcal{X}_T = \prod_{t \in T} \mathcal{X}_t$, where the $\mathcal{X}_t$ are replicas of the range space $\mathcal{X}$ of the $X_t$.
The values $X_T(\omega)$ at $\omega$ of a r.f. $X_T$ will be called sample functions or
trajectories or paths of the r.f.; they are elements of the space $\mathcal{X}_T$, that
is, functions on $T$ with values $X_t(\omega) \in \mathcal{X}_t$. However, we have to insure
that the sections $X_t = (X_t(\omega), \omega \in \Omega)$ at every $t$ be r.v.'s. This necessitates
the introduction of the Borel field $\mathcal{S}_T$ in $\mathcal{X}_T$ generated by the class
of Borel cylinders $C(S_{T_n})$ whose bases $S_{T_n}$ are Borel sets in finite-dimensional
subspaces $\mathcal{X}_{T_n} = \prod_{t \in T_n} \mathcal{X}_t$; it suffices to take finite product bases
$S_{T_n} = \prod_{t \in T_n} S_t$ of Borel sets $S_t$ in $\mathcal{X}_t$. But the class of cylinders with
countably-dimensional Borel bases is contained in $\mathcal{S}_T$, contains the generating
class of cylinders, and is itself a $\sigma$-field, hence

The Borel field $\mathcal{S}_T$ in $\mathcal{X}_T$ coincides with the class of cylinders with
countably-dimensional Borel bases.

The measurable space $(\mathcal{X}_T, \mathcal{S}_T)$ will be called the sample space of $X_T$.

Sample definition. A r.f. $X_T$ is a measurable function on a pr. space
$(\Omega, \mathcal{A}, P)$ to a sample space $(\mathcal{X}_T, \mathcal{S}_T)$; in symbols, $X_T^{-1}(\mathcal{S}_T) \subset \mathcal{A}$.

If a r.f. $X_T$ is so defined then, for every Borel set $S_t \subset \mathcal{X}_t$, $[X_t \in S_t] =
X_T^{-1}(C(S_t)) \in \mathcal{A}$ so that the $X_t$ are r.v.'s. Conversely, if according to
the $\mathcal{R}$-definition the $X_t$ are r.v.'s then, for every finite product $S_{T_n} =
\prod_{t \in T_n} S_t$ of Borel sets $S_t$ in $\mathcal{X}_t$, $X_T^{-1}(C(S_{T_n})) = \bigcap_{t \in T_n} X_T^{-1}(C(S_t)) \in \mathcal{A}$, so
that $X_T$ is a r.f. according to the sample definition. Thus these definitions
are equivalent.

The $\mathcal{R}$-definition leads to analytical properties of r.f.'s $X_T$ in terms of
limits in $\mathcal{R}$, say, $X_T$ is continuous in pr. or a.s. or in the $r$th mean at $t$
according as $X_{t'} \xrightarrow{P} X_t$ or $X_{t'} \xrightarrow{\text{a.s.}} X_t$ or $X_{t'} \xrightarrow{r} X_t$ as $t' \to t$; we
drop "at $t$" when the property holds for all $t$. The sample definition
leads to analytical properties of r.f.'s $X_T$ in terms of those of their sample
functions $X_T(\omega)$, say, $X_T$ is sample continuous or sample measurable or
sample integrable at $\omega$ if the property is true for $X_T(\omega)$; we replace "at $\omega$"
by "on $A$" when the property holds for all $\omega \in A$, drop it when $A = \Omega$,
and say that the sample property is almost sure (a.s.), or holds for almost
all sample functions, when $A = N^c$.

In general, analytical properties of a r.f. relative to the sample space
are "finer" than the corresponding properties relative to the space $\mathcal{R}$ of
r.v.'s. Let us consider a.s. continuity properties: a.s. sample continuity
of $X_T$ implies a.s. continuity of $X_T$ but the converse is not necessarily
true. For, as $t' \to t$, in the first case $X_{t'} \to X_t$ on $N^c$ where the null
event $N$ is independent of $t$, while in the second case $X_{t'} \to X_t$ on $N_t^c$
where the null events $N_t$ may depend upon $t$ and $\bigcup_{t \in T} N_t$ may not be a
null event or may even not be an event.

It may be convenient to describe a.s. continuity and (a.s.) sample
continuity in negative terms. When $X_{t'}$ does not converge a.s. to $X_t$
as $t' \to t$, we say that $t$ is a fixed discontinuity point of $X_T$. A discontinuity
point $t = t(\omega)$ of $X_T(\omega)$ which is not a fixed discontinuity point
of $X_T$ will be called a moving (with $\omega$) discontinuity point of $X_T$. Thus
a.s. continuity of $X_T$ means no fixed discontinuity points while (a.s.)
sample continuity of $X_T$ means neither fixed nor (outside a null event)
moving discontinuity points.
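A simple r.f. with a moving but no fixed discontinuity point may make the distinction concrete. The sketch below is an illustration under stated assumptions, not an example from the text: it takes $X_t(\omega) = 1$ if $U(\omega) \le t$ and $0$ otherwise, with $U$ uniform on $(0, 1)$, and the helper names are hypothetical. For each fixed $t$ the probability of a jump in $(t, t + h]$ is $h$, which tends to 0, yet every sample function jumps at $t = U(\omega)$.

```python
import random

def sample_path(u):
    """Sample function t -> X_t(omega) = 1 if u <= t else 0, where u = U(omega)."""
    return lambda t: 1.0 if u <= t else 0.0

random.seed(1)
us = [random.random() for _ in range(50000)]

def prob_jump_between(t, h):
    """Estimate P[X_{t+h} != X_t] = P[t < U <= t + h], equal to h for uniform U."""
    return sum(1 for u in us if t < u <= t + h) / len(us)

t0 = 0.3
probs = [prob_jump_between(t0, h) for h in (0.1, 0.01, 0.001)]

# Every sample function nevertheless has exactly one jump, located at u.
u = us[0]
path = sample_path(u)
jump_size = path(u) - path(u - 1e-9)
```

Here the null events $N_t = [U = t]$ depend on $t$ and their union over $T$ is the whole space, which is the situation described above.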

An advantage of the sample definition is that it introduces directly
the $\sigma$-field $\mathcal{B}(X_T) = X_T^{-1}(\mathcal{S}_T)$ of events induced by the r.f. $X_T$, the
union $\sigma$-field of the $\sigma$-fields $\mathcal{B}(X_t)$ induced by the r.v.'s $X_t$. As long as
we are concerned with the properties of $X_T$ alone, the only events and
(not necessarily finite) r.v.'s we have to consider are those defined on $X_T$,
that is, events belonging to $\mathcal{B}(X_T)$ and $\mathcal{B}(X_T)$-measurable r.v.'s. Since
the Borel sets in the sample space are cylinders with countably-dimensional
Borel bases $S_{T_c}$ and $X_T^{-1}(C(S_{T_c})) = X_{T_c}^{-1}(S_{T_c})$, events defined
on $X_T$ are defined on countable sections $X_{T_c}$ of $X_T$ (those sections varying
in general with the events). More generally

a. Countability lemma. A r.v. $\xi$ is defined on $X_T$ if and only if it is
a Borel function $g$ of $X_T$, in fact, of a countable section $X_{T_c}$ of $X_T$. In
particular, if $\xi = E(Y \mid X_T)$ a.s. then $\xi = E(Y \mid X_{T_c})$ a.s.

Proof. If $\xi = g(X_T)$ then $\xi^{-1} = X_T^{-1} g^{-1}$ implies that $\mathcal{B}(\xi) \subset \mathcal{B}(X_T)$
and the "if" assertion is proved.

Conversely, let $\mathcal{B}(\xi) \subset \mathcal{B}(X_T)$ so that the events

$$A_{nk} = \left[\frac{k}{2^n} \le \xi < \frac{k+1}{2^n}\right]$$

for $k$ finite and $A_{nk} = [\xi = -\infty]$ or $[\xi = +\infty]$ for $k = -\infty$ or
$+\infty$, belong to $\mathcal{B}(X_T)$. By the sample definition there exist Borel sets
$S_{nk} \in \mathcal{S}_T$ such that $A_{nk} = X_T^{-1}(S_{nk})$ and, since the events $A_{nk}$ are disjoint
in $k$, we also have $A_{nk} = X_T^{-1}(S'_{nk})$ where the Borel sets
$S'_{nk} = S_{nk}\bigl(\bigcup_{j \ne k} S_{nj}\bigr)^c$ are disjoint in $k$. The functions
$g_n = \sum_k \frac{k}{2^n} I_{S'_{nk}}$ are Borel
functions on $\mathcal{X}_T$ and $g_n(X_T) \to \xi$ on the range $X_T(\Omega)$ of $X_T$. Thus,
$g = \lim g_n$ exists on some Borel set $S \supset X_T(\Omega)$ and setting, say, $g = 0$ on
$S^c$, we obtain a Borel function $g$ on $\mathcal{X}_T$ with $\xi = g(X_T)$. This proves
the "only if" assertion.

The "countable section" assertion follows from what precedes since
the events $A_{nk}$ are defined on countable sections of $X_T$; or it suffices to
observe that the events $[\xi < r]$ where $r$ varies over the rationals generate
$\mathcal{B}(\xi)$ and, every $[\xi < r]$ being defined on a countable section $X_{T_r}$, the
r.v. $\xi$ is defined on the countable section $X_{T_c}$ where $T_c = \bigcup_r T_r$. In
particular, if $\xi = E(Y \mid X_T)$ a.s. then $\xi = E(Y \mid X_{T_c})$ a.s. upon conditioning
by $X_{T_c}$. The proof is terminated.

We may emphasize the argument $(\omega, t)$ and consider r.f.'s as functions
on $\Omega \times T$ to $\mathcal{X}$ with values $X(\omega, t)$. But then we have to insure that
sections $X_t = (X(\omega, t), \omega \in \Omega)$ at every $t$ be r.v.'s, and this brings us back
to the previous definitions. Yet, the present interpretation leads to an
important type of r.f. as follows. Let $\mathcal{T}$ be some $\sigma$-field in $T$.

Measurable r.f. definition. An $(\mathcal{A} \times \mathcal{T})$-measurable r.f. is a
measurable function on $(\Omega \times T, \mathcal{A} \times \mathcal{T})$ to the state space $(\mathcal{X}, \mathcal{S})$.

Since every section at $t$ of an $(\mathcal{A} \times \mathcal{T})$-measurable function $X_T$ is
$\mathcal{A}$-measurable, it follows that the sections $X_t$ are $\mathcal{A}$-measurable; similarly
for $\mathcal{T}$-measurability of sections $X_T(\omega)$ at $\omega$:

$(\mathcal{A} \times \mathcal{T})$-measurable r.f.'s are r.f.'s and their sample functions are
$\mathcal{T}$-measurable.

A measurable $X_T$ is a Borel r.f. if $T$ is a Borel set and $\mathcal{T}$ is the $\sigma$-field of
Borel sets in $T$. Then we introduce on $\mathcal{T}$ the Lebesgue measure $\lambda$ (that
is, its restriction to Borel measure on $\mathcal{T}$). If $X_T$ coincides with a Borel
r.f. outside a $(P \times \lambda)$-null set, we say that $X_T$ is an a.e. Borel r.f.
We emphasize that measurability of r.f.'s is relative to the product
$\sigma$-field $\mathcal{A} \times \mathcal{T}$ and not, as usual, to the completion of this $\sigma$-field with
respect to the product measure $P \times \mu$ where $\mu$ is a measure on $\mathcal{T}$.

The importance of measurable r.f.'s is due to the fact that, as we shall see
later, under some continuity conditions a r.f. is equivalent to a measurable
one, and to the following immediate consequences of measurability
and of Fubini's theorem.


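b. Measurability lemma. If $X_T$ is an $(\mathcal{A} \times \mathcal{T})$-measurable r.f. then
the sample functions are $\mathcal{T}$-measurable and, for $X_T \ge 0$ or
$\int E|X_t|\,d\mu(t) < \infty$ where $\mu$ is a $\sigma$-finite measure on $\mathcal{T}$,

$$E \int X_t\,d\mu(t) = \int E X_t\,d\mu(t).$$

If $X_T$ is a Borel r.f. then, for every r.v. $\tau$ with range in $T$, the function
$X_\tau = (X_{\tau(\omega)}(\omega), \omega \in \Omega)$ is a r.v.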

Application. Let $X_T$ be a Borel or, more generally, an a.e. Borel
stationary r.f. with $T = [0, \infty)$; stationarity means as usual that
$\mathcal{L}(X_{t_1}, \cdots, X_{t_n}) = \mathcal{L}(X_{t_1+h}, \cdots, X_{t_n+h})$ for all finite subsets
$(t_1, \cdots, t_n) \subset T$ and all $h > 0$. It follows at once that if a r.v. $\xi$ is defined on
$X_T$ and $\xi_h$ is its translate by $h$, then $\xi$ and $\xi_h$ have the same distribution.

Let $E|X_0|^r$ be finite for some $r \ge 1$ so that the r.v.'s $Y_n = \int_{n-1}^{n} X_s\,ds$
and $Z_n = \int_{n-1}^{n} |X_s|\,ds$ exist and $EY_n = EX_0$, $EZ_n = E|X_0|$,
$E|Y_n|^r \le EZ_n^r \le E|X_0|^r$ are finite. Since the sequences are stationary,
it follows that $(Y_1 + \cdots + Y_n)/n \to Y$ and $(Z_1 + \cdots + Z_n)/n \to Z$,
a.s. and in the $r$th mean, hence $Z_n/n \to 0$ a.s. and in the $r$th mean.
Therefore, if $n = n_t$ is the largest integer contained in $t$ then, as $t \to \infty$,

$$U_t = \frac{1}{t}\int_0^t X_s\,ds = \frac{n}{t}\left(\frac{1}{n}\int_0^n X_s\,ds + \frac{1}{n}\int_n^t X_s\,ds\right) \xrightarrow{\text{a.s.},\ r} Y,$$

since $n/t \to 1$, the first divided integral in parentheses is $(Y_1 + \cdots + Y_n)/n$
and the second one is bounded by $Z_{n+1}/n$. Finally, if $\mathcal{C}$ is the $\sigma$-field
of invariant (under translations) events on $X_T$ then, for every $C \in \mathcal{C}$,

$$\int_C E^{\mathcal{C}} X_0\,dP = \int_C X_0\,dP = \int_C X_s\,dP = \int_C U_t\,dP \to \int_C Y\,dP$$

so that $Y = E^{\mathcal{C}} X_0$ a.s. Thus

a. R.F.'s stationarity theorem. If $X_{[0, \infty)}$ is a stationary a.e. Borel
r.f. with $E|X_0|^r < \infty$ for an $r \ge 1$, then

$$\frac{1}{t}\int_0^t X_s\,ds \xrightarrow{\text{a.s.},\ r} E^{\mathcal{C}} X_0$$

where $\mathcal{C}$ is the $\sigma$-field of invariant events on $X_T$.
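The theorem can be checked numerically on a stationary r.f. whose limit is genuinely random. The sketch below is an illustration under stated assumptions, not part of the text: it takes $X_s = A + \cos(s + \Theta)$ with $A$ Gaussian and $\Theta$ uniform on $[0, 2\pi)$, independent, so that the level $A$ is invariant under translations and plays the role of $E^{\mathcal{C}} X_0$. The helper `time_average` and the midpoint rule are hypothetical choices.

```python
import math
import random

def time_average(path, t, steps=100000):
    """Midpoint-rule approximation of U_t = (1/t) * integral_0^t X_s ds."""
    ds = t / steps
    return sum(path((i + 0.5) * ds) for i in range(steps)) * ds / t

random.seed(2)
results = []
for _ in range(3):
    a = random.gauss(0.0, 1.0)                 # random level, constant along the path
    theta = random.uniform(0.0, 2 * math.pi)   # uniform phase makes the r.f. stationary
    path = lambda s, a=a, theta=theta: a + math.cos(s + theta)
    results.append((a, time_average(path, t=500.0)))

# Each pair (a, u) should have u close to a: the time average converges to the
# invariant r.v. A, not to the constant E X_0 = 0.
```

For this family the exact average is $A + (\sin(t + \Theta) - \sin\Theta)/t$, so the error is at most $2/t$.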

38.2. Separability. Let $X_T = (X_t, t \in T)$ be a r.f. with domain $T$.
Let $\bar{T}$ be the closure of $T$ and set $I_n = \left(t - \frac{1}{n}, t + \frac{1}{n}\right)$
for every $t \in \bar{T}$. By definition,

$$\underline{X}_t = \liminf_{t' \to t} X_{t'} = \sup_n \inf_{t' \in I_n T} X_{t'},$$

$$\overline{X}_t = \limsup_{t' \to t} X_{t'} = \inf_n \sup_{t' \in I_n T} X_{t'}.$$

Thus, to every r.f. $X_T$ on $T$ there correspond two limit functions $\underline{X}_{\bar{T}}$
and $\overline{X}_{\bar{T}}$ on $\bar{T}$, respectively, lower and upper semi-continuous. Since the
limits are taken using nondeleted neighborhoods,

$$\underline{X}_t \le X_t \le \overline{X}_t, \quad t \in T,$$

and $\underline{X}_t = X_t = \overline{X}_t$ at every isolated point $t \in T$. Since the $X_t$ are a.s.
finite, it follows that $\underline{X}_t < +\infty$ a.s., $\overline{X}_t > -\infty$ a.s. Also

If $\lim_{t' \to t} X_{t'}$ exists at $t \in T$ then it coincides with $X_t$.

The basic difficulty consists in that the functions $\underline{X}_t$ and $\overline{X}_t$ of $\omega \in \Omega$
may not be measurable, for then analytical properties of the r.f. $X_T$
would not be expressible in probability terms. Since these functions of
$\omega$ are formed by means of sequences of extrema of the $X_t$ on sets of the
form $IT$ where the $I$ are open intervals, we are led to require that all
such extrema be measurable. This requirement is automatically fulfilled
when $T$ is countable and would be fulfilled were it possible to replace
$T$ by some fixed countable subset $S$ of $T$ in the formation of these
extrema. If there is such a set $S$, we say that the r.f. $X_T$ is separable and
is separated by $S$, a separating set of $X_T$. In fact

A. Separability criteria. The following properties are equivalent
and define separability of a r.f. $X_T$: There exists a countable subset
$S = \{s_j\}$ of $T$ such that

for every open interval $I$ whose intersection with $T$ is not empty

$(S_1)$ $\inf_{s_j \in IS} X_{s_j} = \inf_{t \in IT} X_t$, $\quad \sup_{t \in IT} X_t = \sup_{s_j \in IS} X_{s_j}$

$(S_2)$ $\inf_{s_j \in IS} X_{s_j} \le \inf_{t \in IT} X_t$, $\quad \sup_{t \in IT} X_t \le \sup_{s_j \in IS} X_{s_j}$

$(S_3)$ $\inf_{s_j \in IS} X_{s_j} \le X_t \le \sup_{s_j \in IS} X_{s_j}$, $\quad t \in IT$

for every $t \in \bar{T}$

$(S'_1)$ $\liminf_{s_j \to t} X_{s_j} = \liminf_{t' \to t} X_{t'} \le \limsup_{t' \to t} X_{t'} = \limsup_{s_j \to t} X_{s_j}$

$(S'_2)$ $\liminf_{s_j \to t} X_{s_j} \le \liminf_{t' \to t} X_{t'} \le \limsup_{t' \to t} X_{t'} \le \limsup_{s_j \to t} X_{s_j}$

$(S'_3)$ $\liminf_{s_j \to t} X_{s_j} \le X_t \le \limsup_{s_j \to t} X_{s_j}$, $\quad t \in T$.

Proof. The three nonprimed properties are equivalent: For, $(S_1) \Rightarrow
(S_2)$ while $(S_2) \Rightarrow (S_1)$ since for $S \subset T$ the reverse inequalities are always
true, and $(S_2) \Rightarrow (S_3)$ while $(S_3) \Rightarrow (S_2)$ upon taking extrema
over $t \in IT$.

The three primed properties are equivalent: For, $(S'_1) \Rightarrow (S'_2)$ while
$(S'_2) \Rightarrow (S'_1)$ since for $S \subset T$ the reverse inequalities are always true,
and $(S'_2) \Rightarrow (S'_3)$ while $(S'_3) \Rightarrow (S'_2)$ upon replacing $t$ by $t'$ in $(S'_3)$,
letting $t' \to t \in \bar{T}$ and using the semi-continuity of inferior and superior
limits.

It follows that the primed and unprimed properties are equivalent:
For, $(S_2) \Rightarrow (S'_2)$ upon replacing $I$ by $I_n = \left(t - \frac{1}{n}, t + \frac{1}{n}\right)$ and letting
$n \to \infty$, and $(S'_3) \Rightarrow (S_3)$ since $I_n \subset I$ from some $n$ on for any fixed
$t \in IT$ so that the extreme terms of $(S_3)$ are then farther away from $X_t$
than those of $(S'_3)$. The proof is terminated.

The use of separable r.f.'s is justified in the sense that every r.f. is
equivalent to a separable one. In fact, more is true, as follows.

Separability criterion $(S_1)$ means that for every open interval $I$ and
every closed interval $C$

$(S)$ $\quad [X_t \in C, t \in IT] = [X_{s_j} \in C, s_j \in IS] \ (\in \mathcal{A})$,

that is, separability as defined is separability for closed intervals. This
interpretation leads to the general concept of separability for sets of a
given class. In particular, if the above equality holds for all closed sets
$C$, the r.f. is separable for closed sets; it is then a fortiori separable (for
closed intervals). Yet, given any r.f. $X_T$, there exists an equivalent r.f.
separable for closed sets, because of the following

a. Separability lemma. Given a r.f. $X_T$ and a Borel set $C$ in the state
space, there exists a countable subset $S = \{s_j\}$ of $T$ such that for all $t \in T$
the intersections $A_t = [X_{s_j} \in C, s_j \in S][X_t \notin C]$ are null events.

In fact, given a class $\mathcal{C}$ of countable intersections of a countable class $\{C_k\}$
of Borel sets, there exists a countable subset $S$ of $T$ such that the intersections
$A_t$ are subsets of null events $N_t$ which do not vary with $C \in \mathcal{C}$.

Proof. Set $A_{nt} = [X_{s_k} \in C, k \le n][X_t \notin C]$ and $B_n = A_{n s_{n+1}}$. The
$PA_{nt}$ and $p_n = \sup_{t \in T} PA_{nt}$ form nonincreasing sequences and, the $B_n$ being
disjoint, $\sum PB_n \le 1$, hence $PB_n \to 0$. Select $s_1 \in T$ arbitrarily and,
for every $n$ such that $p_n > 0$, select $s_{n+1}$ so that $PB_n \ge \left(1 - \frac{1}{n}\right) p_n$.
If $p_n = 0$ the asserted set is $(s_1, \cdots, s_n)$, and if no $p_n$ vanishes the asserted
set is $(s_1, s_2, \cdots)$ since

$$PA_t \le PA_{nt} \le p_n \le PB_n \Big/ \left(1 - \frac{1}{n}\right) \to 0.$$

If $C \in \mathcal{C}$ is intersection of some sets of a countable class $\{C_k\}$, denote
by $A_{tk}$ and $S_k$ the sets $A_t$ and $S$ relative to $C_k$. Then the set
$S = \bigcup S_k$ is a countable subset of $T$ and the event $N_t = \bigcup_k A_{tk}$ is null.
Since $[X_t \notin C] \subset \bigcup_{k'} [X_t \notin C_{k'}]$, the union being over those $k'$ with
$C \subset C_{k'}$, and $[X_{s_j} \in C, s_j \in S][X_t \notin C_{k'}] \subset A_{tk'} \subset N_t$, the event
$[X_{s_j} \in C, s_j \in S][X_t \notin C]$ is a subset of $N_t$, and
the proof is terminated.

B. Separability existence theorem. Given a r.f. $X_T$, there exists a
separable for closed sets r.f. $\tilde{X}_T$ equivalent to $X_T$ and defined on it.

Proof. We seek a r.f. $\tilde{X}_T$ equivalent to $X_T$ with $\mathcal{B}(\tilde{X}_T) \subset \mathcal{B}(X_T)$,
and a countable subset $S$ of $T$ such that for every open interval $I$ (with
$IT \ne \emptyset$) and every closed set $C$

$$[\tilde{X}_{s_j} \in C, s_j \in IS] = [\tilde{X}_t \in C, t \in IT].$$

Since the left side event contains the right side one, it suffices that it be
contained in the right side one. Since every open interval $I$ is a countable
union of open intervals $I_r$ with rational or infinite endpoints, it
suffices to realize this property simultaneously for all $I_r$.

The class of closed sets is contained in the class $\mathcal{C}$ of countable intersections
$C$ of finite unions $C_k$ of open or closed intervals with rational or
infinite endpoints, so that we can apply the separability lemma with $T$
replaced by $I_r T$, $S$ by $S_r$, and $N_t$ by $N_{tr}$. Then the set $S = \bigcup_r S_r$ is a
countable subset of $T$, the event $N_t = \bigcup_r N_{tr}$ is null and is the same
for all $C$ and all $I_r$ and, for $t \in I_r T$,

$$[X_{s_j} \in C, s_j \in I_r S][X_t \notin C] \subset N_t.$$

Given $I_r$, let $C_r(\omega)$ be the closure of the set $\{X_{s_j}(\omega), s_j \in I_r S\}$ in the
closure $\bar{\mathcal{X}}$ of the state space. The set $C_t(\omega) = \bigcap_{I_r \ni t} C_r(\omega)$ is nonempty
and closed, and $X_t(\omega) \in C_t(\omega)$ for $\omega \notin N_t$. We set $\tilde{X}_t(\omega) =
X_t(\omega)$ for $t \in S$, $\omega \in \Omega$ and for $t \notin S$, $\omega \notin N_t$, and we set, say, $\tilde{X}_t(\omega) =
\liminf_{s_j \to t} X_{s_j}(\omega)$ for $t \notin S$, $\omega \in N_t$. Thus $\tilde{X}_T$ is equivalent to $X_T$ with
$\mathcal{B}(\tilde{X}_T) \subset \mathcal{B}(X_T)$ and $\tilde{X}_t(\omega) \in C_t(\omega)$ for every $\omega$ and $t$. Given a closed set $C$,
if $\omega$ is such that $\tilde{X}_{s_j}(\omega) \in C$ for all $s_j \in I_r S$, then $C_t(\omega) \subset C$ for all
$t \in I_r T$ and, by definition of $\tilde{X}_T$, $\tilde{X}_t(\omega) \in C$ for all $t \in I_r T$. Therefore
the set $\{\tilde{X}_{s_j}(\omega) \in C, s_j \in I_r S\}$ is contained in the set $\{\tilde{X}_t(\omega) \in C,
t \in I_r T\}$, hence coincides with it. The proof is terminated.

Separability implications. Separability implies properties of separating
sets and of one-sided limits, and it will be convenient to weaken
slightly this concept, as follows.

1° A.s. separability. The equivalent definitions of separability
were assumed to hold on $\Omega$. Yet, the proof of their equivalence remains
valid when $\Omega$ is replaced by any fixed event. If, in particular, they hold
outside some fixed null event, we say that the r.f. $X_T$ is a.s. separable
and continue to say that $S$ separates it. When $X_T$ is a.s. separable so is
every r.f. a.s. equal to it, and there always exists a separable r.f. a.s.
equal to $X_T$: it suffices to change $X_T$ to a constant on the exceptional
null event. Thus

In the study of a.s. properties of a given r.f., a.s. separability may be replaced
by separability without loss of generality.

Note that in A

If the separability criteria in terms of open intervals $I$ hold outside null
events $N(I)$ which may vary with $I$, then the r.f. is a.s. separable.

For then, they hold for all $I$ outside the fixed null event $N = \bigcup_r N(I_r)$
where the $I_r$ are all open intervals with rational or infinite endpoints.

2° Separating sets. Since every $t \in T$ has to be a limit along any
separating set of a separable r.f. $X_T$, such sets are to be dense in $T$ and,
in particular, contain the set of isolated points of $T$. On account of
$(S_2)$, the union of a separating set with any countable subset of $T$ is also
a separating set. Therefore,

A separable r.f. $X_T$ is separated by a sufficiently large countable subset
of $T$ dense in it and by any larger countable subset of $T$.

Thus, whenever convenient, we may include within a separating set a
suitable countable set, say, that of all rationals or of all dyadic numbers
belonging to $T$. Note that

If the separability criteria in terms of limits hold outside null events $N_t$
which may depend upon $t$ and the r.f. $X_T$ is known to be a.s. separable, then
the set $S$ therein separates $X_T$.

For, $(S'_3)$ outside $N_t$ implies $(S_3)$ outside $N_t$, while a.s. separability implies
$(S_2)$ outside a fixed null event $N'$ with a separating set $S' = \{s'_k\}$
(in lieu of $S$). Therefore, outside the fixed null event $N = N' \cup \bigl(\bigcup_k N_{s'_k}\bigr)$,

$$\inf_{s_j \in IS} X_{s_j} \le \inf_{s'_k \in IS'} X_{s'_k} \le \inf_{t \in IT} X_t$$

and similarly for the suprema.

3° One-sided limits. Separability allows us to replace two-sided
limits along the domain of a r.f. by two-sided limits along a suitable
countable set. The question arises whether the same is true of one-sided
limits which are defined like the two-sided ones but with $I'_n = \left(t - \frac{1}{n}, t\right)$
for left limits and $I''_n = \left(t, t + \frac{1}{n}\right)$ for right limits, in lieu of
$I_n = \left(t - \frac{1}{n}, t + \frac{1}{n}\right)$. The answer is as follows.

Let $X_T$ be a r.f. separated by $S = \{s_j\}$ and let $t$ be a left limit point of $T$.
There exists in $S$ a sequence $t_n \uparrow t$ such that, outside a null event $N_t$,

$$\underline{X}_{t-0} = \liminf_{t' \to t-0} X_{t'} = \liminf_{t_n \uparrow t} X_{t_n}, \quad \overline{X}_{t-0} = \limsup_{t' \to t-0} X_{t'} = \limsup_{t_n \uparrow t} X_{t_n}.$$

In particular, if $\lim X_{t_n}$ exists a.s. for every sequence $t_n \uparrow t$, then $X_{t-0} =
\lim_{t' \to t-0} X_{t'}$ exists a.s. even when the exceptional null events vary with the
sequence.

Similarly for right limit points of $T$.

It will suffice to prove the general assertion for, say, a left limit point $t$.
We may assume that the r.f. is bounded (replacing, if necessary, $X_T$ by
$\mathrm{Arctan}\, X_T$), so that the limits are finite. Because of separability there
exist finite subsets $S_{nm} = (s_{nk}, k \le m)$ of $S$ in $I'_n T$ such that

$$P[Y_{nm} - Y_n > 1/n] < 1/n, \quad P[Z_n - Z_{nm} > 1/n] < 1/n$$

where

$$Y_n = \inf_{t' \in I'_n T} X_{t'} \le Y_{nm} = \inf_k X_{s_{nk}},$$

$$Z_{nm} = \sup_k X_{s_{nk}} \le Z_n = \sup_{t' \in I'_n T} X_{t'}.$$

The elements of all these subsets can be reordered into a sequence $t_n \uparrow t$,
for they are all less than $t$ and for every $n$ only a finite number of them
are less than $t - \frac{1}{n}$. Since all $S_{nm} \subset I'_n S$, it follows that

$$Y_n \le Y'_n = \inf_{t_k \in I'_n S} X_{t_k} \le Y_{nm}$$

so that $P[Y'_n - Y_n > 1/n] < 1/n$ and

$$0 \le Y'_n - Y_n \to \liminf_{t_n \uparrow t} X_{t_n} - \underline{X}_{t-0},$$

while the left side converges to 0 in pr.
Thus, $\underline{X}_{t-0} = \liminf X_{t_n}$ outside a null event $N'_t$ and similarly $\overline{X}_{t-0} =
\limsup X_{t_n}$ outside a null event $N''_t$, so that both equalities hold outside
$N_t = N'_t \cup N''_t$ and the general assertion is proved.

If the one-sided limits $g_{t \pm 0}$ of a numerical function $g$ exist and are finite
but unequal, the function is said to have a simple (or first kind) discontinuity
at $t$. If, moreover, $g_t$ lies between the one-sided limits (inclusively),
the discontinuity is said to be a jump.

If a r.f. $X_T$ is separable then the simple discontinuities of almost all its
sample functions are jumps except perhaps at fixed discontinuity points,
and if $X_T$ is separable for closed sets then at these jumps the sample functions
are left or right continuous.

Let $S = \{s_j\}$ separate $X_T$ and use separability relation $(S)$ for closed
intervals and for closed sets with all open intervals $I \ni t$ where $t \notin S$
is a simple discontinuity point of $X_T(\omega)$. Upon taking the closed interval
with endpoints $X_{t \pm 0}(\omega)$, $(S)$ implies that $X_t(\omega)$ belongs to the interval
so that $t$ is a jump point. Upon observing that these endpoints are the
common limit points of all sets $(X_{s_j}(\omega), s_j \in IS)$, separability for closed
sets implies that $X_t(\omega)$ coincides with one of these common limit points.
Thus the assertions are true for all sample functions except perhaps at
the $s_j$. At those $s_j$ which are not fixed discontinuity points the sample
functions $X_T(\omega)$ are continuous for $\omega$'s outside null events $N_{s_j}$, hence
outside their null union. The assertions are proved.

Random analysis is based upon the fact that limits along $T$ for a r.f.
$X_T$ separated by $S$ are limits along $S$, necessarily measurable. The immediate
problem is that of finding separating sets. There is no difficulty
whenever the r.f. is continuous in pr.:

C. Continuity separation theorem. Let $X_T$ be an a.s. separable r.f.
continuous in pr. or a.s. Then every countable subset dense in $T$ separates
$X_T$. Let $I = [a, b]$ be intervals with endpoints in $T$ and let $S_n =
(s_{nk}, k \le k_n)$ form sequences of finite subsets of $IT$ becoming dense in it:
$\sup_{t \in IT} \inf_k |t - s_{nk}| \to 0$. Then

$$\inf_k X_{s_{nk}} \xrightarrow{P} \inf_{t \in IT} X_t, \quad \sup_k X_{s_{nk}} \xrightarrow{P} \sup_{t \in IT} X_t$$

and if $X_T$ is continuous a.s., then the convergence is a.s., while if $S_1 \subset S_2
\subset \cdots$ then the convergence is a.s. monotone.

Proof. Let $S = \{s_j\}$ be a subset of $T$ dense in it. Convergence in
pr. along $S$ is implied by convergence a.s. and implies convergence a.s.
of a subsequence. Thus, if $s_j \to t$ then there exists a subsequence
$s_{j'} \to t$ such that $X_{s_{j'}} \to X_t$ outside a null event $N_t$ so that on $N_t^c$

$$\liminf_{s_j \to t} X_{s_j} \le X_t \le \limsup_{s_j \to t} X_{s_j}.$$

The separation assertion results then from separability implications 2°
for separating sets and implies the pointwise convergence assertion.

It remains to prove the convergence assertions. It will suffice to do
so for, say, the infima. We can assume $X_{IT}$ bounded. Let $S = \{s_j\}$
with $a, b \in S$ denote now a separating set of $X_{IT}$ and set

$$Y = \inf_{t \in IT} X_t, \qquad Y_n = \inf_k X_{s_{nk}}, \qquad Z_m = \inf_{j \le m} X_{s_j}.$$

Since $Z_m \downarrow Y$ a.s. as $m \to \infty$, it follows that for every $\varepsilon > 0$, as $n \to \infty$
then $m \to \infty$,

$$P[Y_n - Y \ge 2\varepsilon] \le P[Y_n - Z_m \ge \varepsilon] + P[Z_m - Y \ge \varepsilon] \to 0,$$

so that $Y_n \xrightarrow{P} Y$. If $X_T$ is continuous a.s., hence $\limsup Y_n \le X_{s_j}$ outside
null events $N_{s_j}$, then, outside their null union, $\limsup Y_n \le Y$ while
$Y_n \ge Y$, so that $Y_n \to Y$ a.s. The proof is terminated.
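The monotone case of the theorem can be seen on a single continuous sample function. The sketch below is only an illustration under stated assumptions: the function `x`, the interval $[0, 1]$ and the nested dyadic sets `dyadic_points` are hypothetical choices, not taken from the text. It computes the infima over the finite sets $S_1 \subset S_2 \subset \cdots$ and shows that they can only decrease as the sets become dense.

```python
import math


def dyadic_points(n):
    """Finite subset S_n of [0, 1]: the points k / 2**n, k = 0, ..., 2**n."""
    m = 2 ** n
    return [k / m for k in range(m + 1)]


def x(t):
    # One fixed continuous sample function (a hypothetical choice).
    return math.sin(7.0 * t) + 0.5 * math.cos(3.0 * t)


def inf_over(points):
    """Infimum of the sample function over a finite set of time points."""
    return min(x(s) for s in points)


# S_1 is contained in S_2, and so on, so the infima are nonincreasing in n.
infima = [inf_over(dyadic_points(n)) for n in range(1, 13)]
```

Because the sets are nested, each infimum is taken over a larger set than the preceding one; continuity of the path is what makes the limit equal to the infimum over the whole interval.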

In fact, less than continuity in pr. is required provided a suitable
countable subset is excluded, because of

D. Continuity extension theorem. Let $X_T$ be an a.s. separable r.f.
such that limits in pr. (in the $r$th mean) $X_{t+0}$ or $X_{t-0}$ exist for every
$t \in T'$. Then there exists a countable subset $T_c$ of $T'$ such that for
every $t \in T' - T_c$, $X_{t-0} = X_{t+0}$ and $= X_t$ for $t \in T$ outside a null event
$N_t$. In particular, if $T' = T$ then $X_T$ is continuous in pr. (in the $r$th
mean) on $T - T_c$, and if $T' \supset T$ then $X_T$ extends on $T' - T_c$ to a r.f.
continuous in pr. (in the $r$th mean) and determined up to an equivalence.

Proof. It suffices to prove the general assertion with convergence in
pr. The space of equivalence classes of r.v.'s on our pr. space, with
distance $d(X, Y) = E\big(|X - Y| / (1 + |X - Y|)\big)$, is a complete metric
space in which convergence in distance is equivalent to convergence in
pr. The family $\xi_t$ of classes of r.v.'s equivalent to $X_t$ is a function on $T$ to this
space, and the general assertion reduces to a classical one about functions
which take their values in a complete metric space, proved as follows.

At least one of the one-sided limits $\xi_{t \pm 0}$ has no meaning at isolated
or one-sided limit points of $T\ (\subset R)$. But their set is countable and thus
may be included in $T'$ provided it is also included in the asserted
exceptional countable set $T_c$. Thus it suffices to consider two-sided limit
points of $T$ belonging to $T'$. But the set $T'_n$ of such points and at which
the oscillation of $\xi_t$ is at least $1/n$ is countable for, by hypothesis, if
$t \in T'_n$ then at least one of the one-sided limits, say, $\xi_{t-0}$ exists, so that
$t$ is right endpoint of an interval containing no other points of $T'_n$. Thus
we include the countable set $\bigcup_n T'_n$ in $T_c$. The remaining set $T' - T_c$ is
that of two-sided limit points at which the oscillation of $\xi_t$ vanishes, and
the general assertion follows. The proof is terminated.
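The distance $d(X, Y) = E\big(|X - Y|/(1 + |X - Y|)\big)$ used in the proof measures convergence in pr. and not convergence in the mean. The following sketch makes this concrete on a finite pr. space with equally likely points; the space, the "spike" r.v.'s and the function names are hypothetical and serve only as an illustration.

```python
from fractions import Fraction


def distance_in_pr(x, y):
    """d(X, Y) = E(|X - Y| / (1 + |X - Y|)) on a finite space with equal weights."""
    diffs = [abs(a - b) for a, b in zip(x, y)]
    return sum(Fraction(d, 1 + d) for d in diffs) / len(diffs)


def mean_abs(x):
    """E|X| on the same finite space."""
    return Fraction(sum(abs(a) for a in x), len(x))


def spike(n, size):
    """X_n = n on a set of pr. 1/n and 0 elsewhere (size must be divisible by n)."""
    return [n if i < size // n else 0 for i in range(size)]


SIZE = 1024
zero = [0] * SIZE
# d(X_n, 0) = 1 / (1 + n) tends to 0, while E|X_n| stays equal to 1.
distances = [distance_in_pr(spike(n, SIZE), zero) for n in (1, 2, 4, 8)]
means = [mean_abs(spike(n, SIZE)) for n in (1, 2, 4, 8)]
```

The r.v.'s `spike(n, SIZE)` converge to 0 in pr., and accordingly in distance, although their first absolute moments do not converge to 0.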


Continuous numerical functions on a Borel set in $R$ are Borel functions.
The question arises whether similar properties hold for r.f.'s $X_T$
with some continuity on $T$ based upon pr. The answer is in the
affirmative in the following sense.

Let $\mathcal{S}$ be the $\sigma$-field of Borel sets in a Borel set $T$ and $\lambda$ the Lebesgue
measure on $\mathcal{S}$.


E. Measurability theorem. Let $X_T$ be a r.f. with a Borel set $T$. If
$X_T$ is a.s. continuous and separable, then it is an a.e. Borel r.f. and the
discontinuity sets of almost all its sample functions are $\lambda$-null. If $X_T$ is
continuous in pr., then there exists an equivalent a.e. Borel r.f. separable
for closed sets.


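Proof. 1° Let $T_k^{(n)} = \left[\dfrac{k-1}{2^n}, \dfrac{k}{2^n}\right) T$, $k = 0, \pm 1, \pm 2, \cdots$, and
form the infimum $\underline{X}_k^{(n)}$ and the supremum $\overline{X}_k^{(n)}$ of $X_T$ over each nonempty
$T_k^{(n)}$. Set $\underline{X}_T^{(n)} = \sum_k \underline{X}_k^{(n)} I_{T_k^{(n)}}$, $\overline{X}_T^{(n)} = \sum_k \overline{X}_k^{(n)} I_{T_k^{(n)}}$ and note that

$$\underline{X}_T^{(n)} \uparrow \underline{X}_T \le X_T \le \overline{X}_T, \qquad \overline{X}_T^{(n)} \downarrow \overline{X}_T.$$

Since $\underline{X}_T^{(n)}$ and $\overline{X}_T^{(n)}$ are Borel r.f.'s so are their limits $\underline{X}_T$ and $\overline{X}_T$, hence
the set $L = [(\omega, t): \underline{X}_t(\omega) \ne \overline{X}_t(\omega)]$ is $\mathcal{A} \times \mathcal{S}$-measurable and, by the
preceding inequality, $\underline{X}_t(\omega) = X_t(\omega) = \overline{X}_t(\omega)$ for $(\omega, t) \in L^c$.

If $X_T$ is a.s. continuous then $\underline{X}_t(\omega) = \overline{X}_t(\omega) = X_t(\omega)$ for $\omega \notin N_t$,
$t \in T$, hence the ($\mathcal{A}$-measurable) sections $L_t \subset N_t$ are null events. It
follows that $L$ is $(P \times \lambda)$-null, hence almost all its sections $L_\omega$ are $\lambda$-null.
Thus $X_T$ is an a.e. Borel r.f. and the discontinuity sets of almost all its
sample functions are $\lambda$-null. The first assertion is proved.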

2° It suffices to prove the second assertion for $X_T$ separated for
closed sets by $S = \{s_j\}$ (on account of the separability theorem and the
fact that equivalence preserves continuity in pr.) with $X_T$ and $T$ bounded
(replace, if necessary, $X_T$ by $\operatorname{Arctan} X_T$ and transform $T$ into a bounded
set, say, by $x' = -2 - \dfrac{1}{x}$ for $x < -1$, $x' = x$ for $-1 \le x \le 1$,
$x' = 2 - \dfrac{1}{x}$ for $x > 1$). Reorder the $n$ first $s_j$ into $s_1^{(n)} < \cdots < s_n^{(n)}$
and set $s_0^{(n)} = -\infty$, $T_k^{(n)} = [s_{k-1}^{(n)}, s_k^{(n)}) T$.

The $Y_T^{(n)} = \sum_k X_{s_k^{(n)}} I_{T_k^{(n)}}$ are Borel r.f.'s and, by continuity in pr.,
$Y_t^{(n)} \xrightarrow{P} X_t$ hence $Y_t^{(m)} - Y_t^{(n)} \xrightarrow{P} 0$ as $m, n \to \infty$. Since $T$ and the
sequence $Y_T^{(n)}$ are bounded it follows, by the Fubini theorem, that

$$\int_{\Omega \times T} |Y_t^{(m)}(\omega) - Y_t^{(n)}(\omega)|\, d(P \times \lambda)(\omega, t) = \int_T \big(E|Y_t^{(m)} - Y_t^{(n)}|\big)\, d\lambda(t) \to 0.$$

A fortiori, $Y_T^{(n)}$ converges in $(P \times \lambda)$-measure to $Y_T$ where $Y_T$ is a Borel r.f. and there is a
subsequence $Y_T^{(n')} \to Y_T$ outside a $(P \times \lambda)$-null set $M$. It follows that
$Y_t^{(n')} \to Y_t$ a.s. for all $t$ outside a $\lambda$-null set $T_0$ into which we can and do
include the countable separating set $S$. But, by hypothesis, $Y_t^{(n')} \xrightarrow{P} X_t$
and therefore $X_t = Y_t$ outside a null event $N_t$ for every $t \in T - T_0$.
Thus, the a.e. Borel r.f. $\tilde{X}_T$ with values $Y_t(\omega)$ for $(\omega, t) \in M^c \cap (\Omega \times
(T - T_0))$ and values $X_t(\omega)$ elsewhere is equivalent to $X_T$. Since we
included $S$ in $T_0$ so that $\tilde{X}_S = X_S$ and outside the section $M_t$ every $Y_t$
is limit of $X_{s_j}$ while on these sections $\tilde{X}_t = X_t$, it follows that the set $S$
which separates $X_T$ for closed sets does the same for $\tilde{X}_T$. The proof is
terminated.


Variants. 1° The theorem remains valid when continuity in pr. or
a.s. holds outside a $\lambda$-null Borel subset of $T$: it suffices to increase the
exceptional $(P \times \lambda)$-null set.

2° The theorem remains valid when continuity in pr. is one-sided.
The continuity extension theorem reduces it to 1°.

3° The theorem and its foregoing variants remain valid when "Borel
set $T$" is replaced by "Lebesgue set $T$": proceed as for 1°.

4° The Borel r.f.'s of the theorem are a.e. limits of sample continuous,
in fact, of sample polygonal r.f.'s: We may replace the approximating
simple measurable r.f.'s by polygonal ones with vertices $\underline{X}_k^{(n)}$, $\overline{X}_k^{(n)}$,
or $X_{s_k^{(n)}}$.

38.3. Sample continuity. The presence in r.f.'s of two arguments $\omega$
and $t$ and the requirements of uniqueness then of measurability in $\omega$ of
limits in $t$ led us to equivalence classes of r.f.'s then to a partial retreat
to their never empty subclasses of separable r.f.'s. The a priori weakest
type of limit in $t$ of r.v.'s $X_t$, based upon pr. and yielding r.v.'s, is the
limit in pr. Random analysis is concerned primarily with separable
r.f.'s $X_T$ such that one-sided limits in pr. exist on $T$ and hence, by the
continuity extension theorem, with r.f.'s continuous in pr. outside
countable subsets of $T$. When investigating specific types of r.f.'s, our first
problem will thus be that of existence of one-sided limits in pr. or a.s.
But the next problem and, in fact, the essential problem of random
analysis is that of the behavior of sample functions and especially that
of the a.s. sample behavior, that is, behavior common to almost all
sample functions. In particular, it seeks conditions for some kind of
continuity: a.s. sample continuity, or a.s. sample continuity except
for simple discontinuities (that is, except for jumps, because of
separability) or except for nonsimple discontinuities. In this subsection we
are concerned with some general conditions for such kinds of behavior
when the types of r.f.'s are not specified.

In general, pointwise continuity or continuity based upon pr. is
required or reduced to continuity on compact domains, when it becomes
uniform. For any type "c" of convergence, $X_T$ is "c"-uniformly
continuous if $X_{t'} - X_t \xrightarrow{c} 0$ uniformly in $t \in T$ as $t' \to t$. For ordinary
pointwise convergence hence for sample functions, the classical fact that
continuity is uniform on compact domains is due to the Heine-Borel
characterization of compact sets and to the equivalence of convergence
and mutual convergence, which is also true of convergences based upon
pr. Thus

a. Uniform continuity lemma. For separable r.f.'s on compact
domains, continuity of a sample function and continuities based upon pr.
are uniform.

In what follows $T$ will be compact so that there will be no distinction
between continuity and uniform continuity. To simplify the writing we
shall take $T = [a, b]$ and, whenever convenient, replace it by $[0, 1]$,
which does not restrict the generality. The reader is invited to rewrite
the results for any compact $T$ and examine their validity for noncompact
$T$.

Let $\alpha(I) = \sup_{t', t'' \in IT} |X_{t'} - X_{t''}|$ be the oscillation of $X_T$ on an interval
$I$ and let $\beta(I) = \inf_{t \in IT} \{\alpha(I \cap (-\infty, t)) + \alpha(I \cap (t, +\infty))\}$ be the "left-right"
oscillation of $X_T$ on $I$. Given $t \in T$, let $I_1 \supset I_2 \supset \cdots$ be a nonincreasing
sequence of intervals converging to $\{t\}$ and to which $t$ is interior.
Then $\alpha(I_n)$ converges to the oscillation $\alpha_t$ of $X_T$ at $t$ and every sample
function $X_T(\omega)$ for which $\alpha_t(\omega) = 0$ is continuous at $t$, and conversely.
Similarly $\beta(I_n)$ converges to the "left-right" oscillation $\beta_t$ of $X_T$ at $t$ and
every sample function $X_T(\omega)$ for which $\beta_t(\omega) = 0$ is free of nonsimple
discontinuity at $t$, and conversely. Finally, if $X_{t'_n}(\omega) - X_{t''_n}(\omega) \to 0$ for
some $t'_n \uparrow t$ and $t''_n \downarrow t$ then the sample function $X_T(\omega)$ cannot have a
simple discontinuity at $t$.
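The difference between the oscillation and the "left-right" oscillation is easiest to see on a sampled path. The sketch below is an illustration only: it assumes that a sample function is observed at finitely many ordered time points of an interval, and the names `alpha` and `beta` and the two test paths are hypothetical.

```python
def alpha(values):
    """Oscillation: sup |x(t') - x(t'')| over the sampled points (0 if there are none)."""
    return max(values) - min(values) if values else 0


def beta(values):
    """Left-right oscillation: the smallest value, over split points t, of the
    oscillation strictly to the left of t plus the oscillation strictly to the right."""
    return min(
        alpha(values[:i]) + alpha(values[i + 1:]) for i in range(len(values))
    )


jump_path = [0, 0, 0, 1, 1, 1]    # a single jump: alpha is large, beta vanishes
zigzag_path = [0, 1, 0, 1, 0, 1]  # repeated swings: no split point removes them
```

A single jump contributes to `alpha` but not to `beta`, since the path is constant on each side of the jump. The zigzag path keeps an oscillation of 1 on at least one side of every split point, so its left-right oscillation stays equal to 1.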

A. Sample continuity theorem. Let a r.f. $X_T$, $T = [a, b]$, be
separable for closed sets, let $a, b \in S_n = \{s_{n1}, \cdots, s_{nk_n}\} \uparrow S$ dense in $T$, and
let $I_{nk} = [s_{nk}, s_{n,k+1}]$.

If $P[\max_k \alpha(I_{nk}) \ge \varepsilon] \to 0$ for every $\varepsilon > 0$, then $X_T$ is a.s. sample
continuous.

If $P[\max_k \beta(I_{nk}) \ge \varepsilon] \to 0$ for every $\varepsilon > 0$, then $X_T$ is a.s. sample
continuous except perhaps for simple discontinuities.

If $P[\max_k |X_{s_{nk}} - X_{s_{n,k+1}}| \ge \varepsilon] \to 0$ for every $\varepsilon > 0$, then $X_T$ is a.s.
sample continuous except perhaps for nonsimple discontinuities.

The $\beta(I)$ are assumed measurable. Otherwise, in their definition
replace $T$ by $S$, or in the corresponding condition replace $P$ by its outer
extension.

Proof. If $p_n = P[\max_k \alpha(I_{nk}) \ge \varepsilon] \to 0$ then there exists a
subsequence $n'$ such that $\sum p_{n'} < \infty$. It follows, by the Borel-Cantelli
lemma, that there exists a finite integer-valued r.v. $\nu_\varepsilon$ such that
$\max_k \alpha(I_{n'k}) < \varepsilon$ outside a fixed null event $N_\varepsilon$ for all $n' \ge \nu_\varepsilon$. But, given $t \in [a, b]$,
there exists a subsequence $I_{n'k}$ converging nonincreasingly to $\{t\}$ so that
$\alpha_t \le \varepsilon$ and, $\varepsilon > 0$ being arbitrary, $\alpha_t = 0$. For, if $t \notin S$ then it is
interior to the $I_{n'k}$ and if $t \in S$ then for $n'$ sufficiently large it suffices to
replace $\varepsilon$ by $2\varepsilon$ and $I_{n'k}$ by its union with the other interval of which $t$ is
the endpoint, unless $t = a$ or $b$, in which case the asserted continuity is
automatically one-sided.

Similarly for the other assertions, and the proof is terminated.

Particular cases. 1° Let the suprema be taken over all intervals
$[t, t + h]$ in $[a, b]$ and $o_\varepsilon(h)/h \to 0$ as $h \to 0$ for every $\varepsilon > 0$.

If $\sup P[\alpha[t, t + h] \ge \varepsilon] = o_\varepsilon(h)$ or $\sup P[\beta[t, t + h] \ge \varepsilon] = o_\varepsilon(h)$, then
the r.f. is a.s. sample continuous or a.s. sample continuous except perhaps
for simple discontinuities.

If $\sup P[|X_{t+h} - X_t| \ge \varepsilon] = o_\varepsilon(h)$ then the r.f. is a.s. sample
continuous, except perhaps for nonsimple discontinuities.

Replace $[a, b]$ by $[0, 1]$, take for $S$ the set of dyadic numbers $kh_n$, $h_n =
2^{-n}$, and note that

$$P[\max_k \alpha(I_{nk}) \ge \varepsilon] \le \sum_k P[\alpha(I_{nk}) \ge \varepsilon] \le o_\varepsilon(h_n)/h_n,$$

and similarly for the other assertions. The first proposition is due to
Dynkin and the second one to Dobrushin.

2° In the case of consecutive sums of r.v.'s, convergence in pr.
entailed a.s. convergence under conditions which for $T = [a, b]$ take the
form

(C) $p_{\varepsilon,t}(h)\, P[\sup_{t \le t' \le t+h} |X_{t'} - X_t| \ge g(\varepsilon)] \le P[|X_{t+h} - X_t| \ge \varepsilon]$ for
every $[t, t + h] \subset T$ and every $\varepsilon > 0$, with $g(\varepsilon) \to 0$ as $\varepsilon \to 0$.

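Inequalities of this type were obtained in the case of sums $X_n$
by a procedure which, similarly, yields the following property:

If $P(|X_n - X_k| < \varepsilon \mid X_1, \cdots, X_k) \ge p_\varepsilon$ a.s., $k = 1, \cdots, n$, then

$$p_\varepsilon P[\max_k |X_k| \ge 2\varepsilon] \le P[|X_n| \ge \varepsilon].$$

For, setting $A_k = [|X_k| \ge 2\varepsilon,\ \max_{j < k} |X_j| < 2\varepsilon]$ (with $X_0 = 0$) and
$B_k = [|X_n - X_k| < \varepsilon]$ so that $\sum_k A_k = [\max_k |X_k| \ge 2\varepsilon]$ and $|X_n| \ge \varepsilon$
on the $A_k B_k$, we have

$$P[|X_n| \ge \varepsilon] \ge \sum_k P A_k B_k = \sum_k E\{I_{A_k} E(I_{B_k} \mid X_1, \cdots, X_k)\} \ge p_\varepsilon \sum_k P A_k.$$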
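The inequality $p_\varepsilon P[\max_k |X_k| \ge 2\varepsilon] \le P[|X_n| \ge \varepsilon]$ can be checked exactly on a small example. The sketch below assumes, purely for illustration, that the $X_k$ are the consecutive sums of $n$ independent symmetric $\pm 1$ steps; the increments being independent, the conditional pr. in the hypothesis reduces to $P[|X_{n-k}| < \varepsilon]$, and $p_\varepsilon$ is taken as the smallest of these values. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def prob_small(m, eps):
    """P[|S_m| < eps] for a sum S_m of m independent symmetric +-1 steps."""
    if m == 0:
        return Fraction(1)
    count = sum(1 for steps in product((-1, 1), repeat=m) if abs(sum(steps)) < eps)
    return Fraction(count, 2 ** m)


def maximal_inequality_terms(n, eps):
    """Return (p_eps, P[max_k |X_k| >= 2 eps], P[|X_n| >= eps]) by enumerating all paths."""
    # Independent increments: P(|X_n - X_k| < eps | X_1..X_k) = P[|S_{n-k}| < eps].
    p_eps = min(prob_small(n - k, eps) for k in range(1, n + 1))
    weight = Fraction(1, 2 ** n)
    p_max = Fraction(0)
    p_end = Fraction(0)
    for steps in product((-1, 1), repeat=n):
        partial, running_max = 0, 0
        for step in steps:
            partial += step
            running_max = max(running_max, abs(partial))
        if running_max >= 2 * eps:
            p_max += weight
        if abs(partial) >= eps:
            p_end += weight
    return p_eps, p_max, p_end


p_eps, p_max, p_end = maximal_inequality_terms(10, 2)
```

All quantities are exact fractions, so the comparison `p_eps * p_max <= p_end` involves no sampling error.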

Upon setting $X_k = X_{t_k} - X_t$ it follows, by separability and the usual
limiting procedure, that

If

(C′) $P(|X_{t+h} - X_{t_k}| < \varepsilon \mid X_{t_1} - X_t, \cdots, X_{t_k} - X_t) \ge p_{\varepsilon,t}(h)$ a.s.,
$k = 1, \cdots, n$, for every finite subset $t_1 < \cdots < t_n$ in every $(t, t + h) \subset T$,

then (C) holds with $g(\varepsilon) > 2\varepsilon$.

Therefore, on account of 1° and

$$[\sup |X_{t'} - X_{t''}| \ge 2g(\varepsilon)] \subset [\sup |X_{t'} - X_t| \ge g(\varepsilon)],$$

Under (C) or (C′) with $p_{\varepsilon,t}(h) \ge p_{\varepsilon,t} > 0$ for $h$ sufficiently small, the r.f. is
(left, right) a.s. continuous if and only if it is (left, right) continuous in pr.,
and if $p_{\varepsilon,t} \ge p_\varepsilon > 0$ then $\sup P[|X_{t+h} - X_t| \ge \varepsilon] = o_\varepsilon(h)$ entails a.s.
sample continuity.

In its turn, (C′) is implied by

(C″) $P(|X_{t+h} - X_{t_k}| < \varepsilon \mid X_{t_1}, \cdots, X_{t_k}) \ge p_{\varepsilon,t}(h)$ a.s. for every finite
subset $t_1 < \cdots < t_n$ in every $[t, t + h) \subset T$.

Then, taking $n = 1$ and $t_1 = t$, it follows that

$$P[|X_{t+h} - X_t| \ge \varepsilon] = E P(|X_{t+h} - X_t| \ge \varepsilon \mid X_t) \le 1 - p_{\varepsilon,t}(h)$$

and hence

Under (C″) with $1 - p_{\varepsilon,t}(h) \le o_\varepsilon(h)$, the r.f. is a.s. sample continuous.

Conditions on moduli of continuity in pr. yield moduli of sample
continuity, as follows:

Let $g(h)$ with $|h| \le h_0$ be an even function, nondecreasing in $h > 0$,
such that $g(h) \to 0$ as $h \to 0$. Let $h_n = q^{-n}$, $q > 1$ integer.

B. Sample continuity moduli theorem. Let $X_T$, $T = [a, b]$, be a
separable r.f. such that, for every $t, t + h \in T$,

$$P[|X_{t+h} - X_t| \ge g(h)] \le q(h) \to 0 \quad \text{as } h \to 0.$$

(i) If $\sum q(h_n)/h_n < \infty$ and $\sum g(h_n) < \infty$, then $X_T$ is a.s. sample
continuous.

(ii) If $\sum q(jh_n)/h_n < \infty$ for every integer $j$ and $\sum_{r=1}^{\infty} g(h_{n+r})/g(h_n) \le
a < \infty$ for all $n$, then $X_T$ is a.s. sample continuous and there exists an a.s.
positive r.v. $H$ and a finite constant $c$ such that

$$|X_{t+h} - X_t| < c\,g(h), \qquad |h| < H.$$

If, moreover, $g(h_n)/g(jh_n)$ is arbitrarily small for $n, j$ sufficiently large, then
for every $\varepsilon > 0$ there exists an a.s. positive r.v. $H_\varepsilon$ such that

$$|X_{t+h} - X_t| < (1 + \varepsilon) g(h), \qquad |h| < H_\varepsilon.$$

Proof. To simplify the writing we take $T = [0, 1]$. Since the
hypotheses about $q(h)$ imply continuity in pr., we can and do separate
the r.f. by the dense subset of all $q$-adic numbers $kh_n$, $k = 0, \cdots, 1/h_n$,
$n = 1, 2, \cdots$, $h_n = q^{-n}$, $q > 1$ integer.

1° By the covering rule and the first hypothesis in (i)

$$\sum_n P[\max_k |X_{(k+1)h_n} - X_{kh_n}| \ge g(h_n)] \le \sum_n \sum_k P[|X_{(k+1)h_n} - X_{kh_n}| \ge g(h_n)] \le \sum_n q(h_n)/h_n < \infty$$

so that, by the Borel-Cantelli lemma, $\max_k |X_{(k+1)h_n} - X_{kh_n}| \ge g(h_n)$
finitely often, that is, there exists an a.s. finite and integer-valued r.v. $\nu$
such that for all $k$ and $n \ge \nu$

$$|X_{(k+1)h_n} - X_{kh_n}| < g(h_n).$$

Since every $q$-adic number in $I_{nk} = [kh_n, (k + 1)h_n]$ is of the form
$t = kh_n + \sum_{r=1}^{m} \theta_r h_{n+r}$, where $\theta_r = 0$ or $1$ or $\cdots$ or $q - 1$ and $m$ is some
integer, it follows, by repeated applications of the triangular inequality,
that for all $k$ and $n \ge \nu$

$$|X_t - X_{kh_n}| \le \sum_{r=1}^{m} \theta_r g(h_{n+r}) \le q \sum_{r=1}^{\infty} g(h_{n+r})$$

and, because of separability, the same is true for every $t \in I_{nk}$. By the
second hypothesis in (i), $q \sum_{r=1}^{\infty} g(h_{n+r}) < \varepsilon/2$ for any given $\varepsilon > 0$ and
$n \ge n_\varepsilon$ sufficiently large. Therefore, by applying the triangular
inequality, for $n \ge \max(n_\varepsilon, \nu)$ hence $|h| < H_\varepsilon$ a.s. positive r.v. and all
$t, t + h \in T$, it follows that $|X_{t+h} - X_t| < \varepsilon$ hence $X_T$ is a.s. sample
continuous.

2° The first two hypotheses in (ii) imply those in (i). Therefore, by
the second of these hypotheses in (ii), there exists an a.s. finite r.v. $\nu_0$
such that, for all $kh_n$, $t \in I_{nk}$ and $n \ge \nu_0$,

$$|X_t - X_{kh_n}| \le q \sum_{r=1}^{\infty} g(h_{n+r}) \le aq\,g(h_n).$$

On the other hand, by the first hypothesis in (ii), for every fixed $j$

$$\sum_n P[\max_k |X_{(k+j)h_n} - X_{kh_n}| \ge g(jh_n)] \le \sum_n q(jh_n)/h_n < \infty$$

so that there exists a similar r.v. $\nu_j$ such that, for all $k$ and $n \ge \nu_j$

$$|X_{(k+j)h_n} - X_{kh_n}| < g(jh_n).$$

Let $m$ be an integer to be selected later and let $\nu = \max(\nu_0, \cdots, \nu_{qm})$.
Given $t$ and $h > 0$ there exists an $n$ such that $mh_n < h < qmh_n$, so that
there exist $k$ and $j$ with $m \le j \le qm$ such that

$$(k - 1)h_n < t \le kh_n < (k + j)h_n \le t + h < (k + j + 1)h_n.$$

Since

$$|X_{t+h} - X_t| \le |X_{t+h} - X_{(k+j)h_n}| + |X_{(k+j)h_n} - X_{kh_n}| + |X_{kh_n} - X_t|,$$

it follows that for $n \ge \nu$ hence $h \le H$ a.s. positive r.v.

$$|X_{t+h} - X_t| < g(jh_n) + 2aq\,g(h_n) \le (1 + 2aq) g(h) = c\,g(h).$$

If, moreover, given $\varepsilon > 0$, for $n\ (\ge n_\varepsilon)$ and $m$ hence $j$ sufficiently large,
$g(h_n)/g(jh_n) < \varepsilon/2aq$ then, for $n \ge \max(n_\varepsilon, \nu)$, hence $h < H_\varepsilon$ a.s.
positive r.v.

$$|X_{t+h} - X_t| < (1 + \varepsilon) g(jh_n) \le (1 + \varepsilon) g(h),$$

and the proof is terminated.

Corollary. Let $X_T$ with $T = [a, b]$ be a separable r.f. such that for
some $r\ (> 0)$ and all $t, t + h \in T$ with $h$ sufficiently small

$$E|X_{t+h} - X_t|^r \le p(h).$$

If $p(h) = c|h|^{1+s}$ with $s > 0$ or $p(h) = c|h| / |\log |h||^{1+s}$ with $s > r$,
then $X_T$ is a.s. sample continuous and in the first case for every $0 < a < s/r$
and $\varepsilon > 0$ there exists an a.s. positive r.v. $H_\varepsilon$ such that for all $t, t + h \in T$
and $|h| < H_\varepsilon$

$$|X_{t+h} - X_t| < (1 + \varepsilon) |h|^a.$$

The assertions follow from the theorem and Markov inequality:

$$P[|X_{t+h} - X_t| \ge g(h)] \le q(h) = p(h)/g^r(h),$$

upon taking in the first case $g(h) = |h|^a$ so that $q(h) = c|h|^{1+s-ar}$,
and in the second case $g(h) = |\log |h||^{-\beta}$ with $1 < \beta < s/r$ so that
$q(h) = c|h| / |\log |h||^{1+s-\beta r}$.

The fact that $p(h) = c|h|^{1+s}$, $s > 0$, implies a.s. sample continuity
is due to Kolmogorov.
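The role of the restriction $0 < a < s/r$ in the first case can be made concrete by evaluating the two series of hypothesis (i) of the moduli theorem for $g(h) = |h|^a$ and $q(h) = c|h|^{1+s-ar}$. The sketch below is a numerical illustration only; the base `base` of the numbers $h_n$, the number of terms and the function name are hypothetical. The values $r = 4$, $s = 1$ correspond to the bound $E|X_{t+h} - X_t|^4 = 3h^2$ satisfied by Brownian increments.

```python
def hypothesis_partial_sums(r, s, a, base=2, c=1.0, terms=200):
    """Partial sums of sum q(h_n)/h_n and sum g(h_n) for h_n = base**(-n),
    with g(h) = h**a and q(h) = c * h**(1 + s - a*r)."""
    sum_q = 0.0
    sum_g = 0.0
    for n in range(1, terms + 1):
        h = float(base) ** (-n)
        sum_q += c * h ** (1 + s - a * r) / h
        sum_g += h ** a
    return sum_q, sum_g


# a = 0.2 is below s/r = 0.25: both series are geometric with ratio 2**(-0.2).
sum_q_ok, sum_g_ok = hypothesis_partial_sums(r=4, s=1, a=0.2)

# a = 0.25 equals s/r: every term q(h_n)/h_n equals c, so the series diverges.
sum_q_bad, _ = hypothesis_partial_sums(r=4, s=1, a=0.25)
```

For $a < s/r$ the terms $q(h_n)/h_n = c\,h_n^{s-ar}$ decrease geometrically; at $a = s/r$ they are constant and the partial sums grow linearly with the number of terms.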


Application to second order calculus. Analytical properties of second
order r.f.'s which can be described in terms of their covariances constitute
the second order calculus. This was done in Section 34 for continuity,
differentiability and integrability, all in q.m., and required no
concepts introduced in this section. We now proceed to use them and go
back to the usual abuse of notation: $X(t)$ represents a second order
r.f., complex-valued, with $t$ varying over an interval $[a, b]$, and
$\Gamma(t, t') = EX(t)\overline{X}(t')$ represents its covariance; it is not assumed that
$EX(t) = 0$. We recall that $\Delta_h X(t) = X(t + h) - X(t)$, $\Delta_h \Delta_{h'} \Gamma(t, t')
= E \Delta_h X(t) \Delta_{h'} \overline{X}(t')$, $\operatorname{Var} \Gamma = \iint |d^2 \Gamma(t, t')|$, and that if $\Gamma(t, t')$ is of
bounded variation, then the limits $\Gamma(t \pm 0, t' \pm 0)$ exist and differ from
$\Gamma(t, t')$ on a countable set of $(t, t')$ only, so that the one-sided limits in q.m.
$X_{t \pm 0}$ exist and $X(t)$ is continuous in q.m. outside a countable set of
values of $t$. This permits us to extend, with obvious modifications,
Gavce’s theorem which follows to the case of covariances of bounded
variation. Since all relations below between r.v.'s will be a.s. relations,
we drop "a.s."


C. Second order calculus theorem. Let $X(t)$, $t \in [a, b]$, be a second
order r.f. with a continuous covariance $\Gamma(t, t')$. Then, up to equivalences,

(i) The indefinite integral in q.m. $Y(t) = \int_a^t X(s)\, ds$ exists, $X(t)$ is
its derivative in q.m., and if $Y_0(t)$ is a primitive in q.m. of $X(t)$ then
$Y(t) = Y_0(t) - Y_0(a)$.

The r.f. $X(t)$ continuous in q.m. is Borelian, sample integrable and sample
square integrable, and its indefinite sample and q.m. integrals coincide.

(ii) If $\Delta_h \Delta_h \Gamma(t, t) \le ch^2$ for $h$ sufficiently small, then $X(t)$ is sample
continuous, is the sample derivative of its indefinite sample integral, and
for every $\varepsilon > 0$ and $0 < a < \frac{1}{2}$ there exists an a.s. positive r.v. $H_{\varepsilon,a}$ such
that $|X(t + h) - X(t)| < (1 + \varepsilon)|h|^a$ for $|h| < H_{\varepsilon,a}$.

If the derivative $\dfrac{\partial^2}{\partial t\, \partial t'} \Gamma(t, t')$ exists and is finite then $X(t)$ is differentiable
in q.m., and if $\Delta_h \Delta_h \dfrac{\partial^2}{\partial t\, \partial t'} \Gamma(t, t) \le c'h^2$ for $h$ sufficiently small then $X(t)$
is sample differentiable.

(iii) If the derivative $\dfrac{\partial^{2n+2}}{\partial t^{n+1}\, \partial t'^{n+1}} \Gamma(t, t')$ exists and is finite then $X(t)$ is
$n$ times sample differentiable, and if $\Gamma(t, t')$ is infinitely differentiable, then
$X(t)$ is infinitely sample differentiable.

Proof. We use throughout the convergence in q.m. criterion without
further comment.

1° Since $\Gamma(t, t')$ is continuous hence bounded on $[a, b] \times [a, b]$, the
indefinite integral in q.m. $Y(t)$ exists and, as $h \to 0$,

$$E\left|\frac{1}{h} \int_t^{t+h} X(s)\, ds - X(t)\right|^2 = \frac{1}{h^2} \int_t^{t+h}\!\!\int_t^{t+h} \Gamma(s, s')\, ds\, ds' - \frac{1}{h} \int_t^{t+h} \Gamma(s, t)\, ds - \frac{1}{h} \int_t^{t+h} \Gamma(t, s')\, ds' + \Gamma(t, t) \to 0,$$

so that $X(t)$ is the derivative in q.m. of $Y(t)$. Therefore, if $X(t)$ is also
the derivative in q.m. of $Y_0(t)$ then the derivative $\Delta'(t)$ in q.m. of $\Delta(t) =
Y(t) - (Y_0(t) - Y_0(a))$ is 0 while $\Delta(a) = 0$. It follows that $\Delta(t) = 0$
since $\dfrac{d}{dt} E|\Delta(t)|^2 = E\{\Delta(t)\overline{\Delta}'(t) + \Delta'(t)\overline{\Delta}(t)\} = 0$, and the first set of
assertions in (i) is proved.

$X(t)$ is continuous in q.m. hence in pr. and consequently, by the
measurability theorem, the r.f. $X(t)$ is equivalent to a separable a.e.
Borel r.f. Boundedness of $E|X(t)|^2 = \Gamma(t, t)$ implying that of $E|X(t)|
\le E^{1/2}|X(t)|^2$, the second set of assertions follows from the
measurability lemma, except for the assertion of equivalence of $Y(t)$ and the
sample integral $Z(t)$.

Since $Y_n(t) = \sum_k X(t_{nk})(t_{nk} - t_{n,k-1}) \xrightarrow{\text{q.m.}} Y(t)$ where

$$a = t_{n0} < \cdots < t_{nk_n} = t, \qquad \max_k (t_{nk} - t_{n,k-1}) \to 0,$$

the assertion will be proved if we show that

$$E|Y_n(t) - Z(t)|^2 = E|Y_n(t)|^2 - EY_n(t)\overline{Z}(t) - E\overline{Y}_n(t)Z(t) + E|Z(t)|^2 \to 0.$$

But

$$E|Y_n(t)|^2 \to \int_a^t\!\!\int_a^t \Gamma(s, s')\, ds\, ds'$$

while, by the measurability lemma,

$$E|Z(t)|^2 = E \int_a^t\!\!\int_a^t X(s)\overline{X}(s')\, ds\, ds' = \int_a^t\!\!\int_a^t \Gamma(s, s')\, ds\, ds'$$

and

$$EY_n(t)\overline{Z}(t) = \sum_k \int_a^t \Gamma(t_{nk}, t')\, dt'\, (t_{nk} - t_{n,k-1}) \to \int_a^t\!\!\int_a^t \Gamma(t, t')\, dt\, dt'.$$

The assertion is proved and the proof of (i) is terminated.
2° Since, by the first hypothesis in (ii),

$$E|X(t + h) - X(t)|^2 = \Delta_h \Delta_h \Gamma(t, t) \le ch^2,$$

the first and third assertions in (ii) result from the Corollary of the
sample continuity moduli theorem with $r = 2$ and $s = 1$ applied to an
equivalent separable r.f. The second one results from the fundamental
theorem of ordinary calculus, since $Z(\omega, t) = \int_a^t X(\omega, s)\, ds$ where $X(\omega, s)$
is continuous in $s$. The fourth assertion is immediate. The fifth and last
one results from

$$E|X'(t + h) - X'(t)|^2 = \Delta_h \Delta_h \frac{\partial^2}{\partial t\, \partial t'} \Gamma(t, t) \le c'h^2$$

upon applying what precedes with $X(t)$ replaced by its derivative in
q.m. $X'(t)$, and assertions (iii) readily follow.
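The hypothesis $\Delta_h \Delta_h \Gamma(t, t) \le ch^2$ in (ii) can be tested directly on a given covariance. The sketch below is an illustration under stated assumptions: the two covariances are hypothetical examples chosen here, namely $\Gamma(t, t') = \cos(t - t')$, which is the covariance of $A \cos t + B \sin t$ with uncorrelated $A$, $B$ of unit second moment, and $\Gamma(t, t') = \min(t, t')$, the covariance of Brownian motion.

```python
import math


def second_difference(gamma, t, h):
    """Delta_h Delta_h Gamma(t, t) = E|X(t + h) - X(t)|**2 for a real covariance."""
    return gamma(t + h, t + h) - gamma(t + h, t) - gamma(t, t + h) + gamma(t, t)


def cosine_covariance(t, u):
    return math.cos(t - u)


def brownian_covariance(t, u):
    return min(t, u)


steps = (0.5, 0.1, 0.01)
# Cosine covariance: the second difference is 2 - 2 cos h, which is at most h**2.
cosine_values = [second_difference(cosine_covariance, 0.3, h) for h in steps]
# Brownian covariance: the second difference is h, which is not O(h**2).
brownian_values = [second_difference(brownian_covariance, 0.3, h) for h in steps]
```

The cosine covariance satisfies the hypothesis with $c = 1$. The Brownian covariance does not, since its second difference is of order $h$; this is consistent with the theorem, because condition (ii) would also yield the sample derivative assertions.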


38.4. Random times. Random times are at the center of random
Analysis and, especially, of sample Analysis. We already encountered
them in the case of sequences in §26 and in 32.4. Moreover, random
times are of the essence in the continuous parameter case. They were
introduced and used systematically by P. Lévy.

Statistical Physics has a familiar, the "Maxwell demon", who travels
along the individual paths of particles subject to deterministic laws of
mechanics; his clock is the same along all paths. In sample Analysis,
there is now also a familiar, the "Lévy demon", who travels along
individual sample paths of r.f.'s, and his "random time" clock varies with
the paths. In fact, the Maxwell demon is but a degenerate form of the
Lévy demon.

Let $X_T$ be a r.f. on a "time" domain $T \subset R$. This r.f. is
automatically accompanied by the nondecreasing family of $\sigma$-fields $(\mathcal{B}(X_s, s \le t),
t \in T)$. Intuitively, the observable events up to time $t$ (included),
determined by observations of $(X_s, s \le t)$, are members of $\mathcal{B}(X_s, s \le t)$.
Let, say, $\tau$ be the "time" $X_T$ first reaches a value $c\ (\in R)$. This time
depends upon the "state of Nature" $\omega \in \Omega$, that is, $\tau(\omega)$ is the time the
sample function $X_T(\omega)$ first reaches the value $c$.
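The first time a level is reached is the basic example of a time decided by observation up to the present. The sketch below assumes, for illustration only, a discrete "time" domain and a sample function given as a list of values; "reaches $c$" is read as "is at least $c$", which agrees with "equals $c$" for integer-valued paths with unit steps that start below $c$. The function names are hypothetical.

```python
def first_hitting_time(path, c):
    """First index t at which the sampled path is at least c, or None if it never is."""
    for t, value in enumerate(path):
        if value >= c:
            return t
    return None


def reached_by(path, c, t):
    """Indicator of the event [tau <= t], computed from the path up to time t only."""
    return first_hitting_time(path[: t + 1], c) is not None


path = [0, 1, 0, 1, 2, 3, 2]
tau = first_hitting_time(path, 2)

# Changing the path after time 4 cannot change whether the level was reached by time 4.
other_path = path[:5] + [-5, -5]
```

Whether $\tau \le t$ holds depends only on the values of the path up to time $t$: `reached_by` never looks beyond index `t`, and altering the later values, as in `other_path`, leaves its answer unchanged. When the level is never reached the function returns `None`, which plays the role of an exceptional value.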

It may and does happen that the observable events belong to larger
$\sigma$-fields than those determined by the r.f. $X_T$. Then $X_T$ is accompanied
by a $\sigma$-field-valued nondecreasing function $\mathcal{B}_T = (\mathcal{B}_t, t \in T)$ on $T$ and
we write $(X_T, \mathcal{B}_T)$ in lieu of $X_T$. Once and for all, for all $r \le s \le t$
belonging to $T$,

$$\mathcal{B}_t \supset \mathcal{B}_s \supset \mathcal{B}(X_r, r \le s).$$

These larger $\sigma$-fields may happen in various ways:

If $g_T = (g_t, t \in T)$ where the $g_t$ are Borel functions on $R$ to $R$, then the
r.f. $g_T(X_T) = (g_t(X_t), t \in T)$ is accompanied automatically by the
$\sigma$-fields of its own observable events: for every $t \in T$,

$$\mathcal{B}(g_s(X_s), s \le t) \subset \mathcal{B}(X_s, s \le t).$$

Yet, such a transformation of $(X_T, \mathcal{B}_T)$ with some property defined in
terms of $X_T$ and $\mathcal{B}_T$ may preserve this property in terms of $g_T(X_T)$
and the same $\mathcal{B}_T$. For example, if $(X_T, \mathcal{B}_T)$ is a "submartingale"
so is $(X_T^+, \mathcal{B}_T)$ (see 39.1). A much more impelling situation arises when
a property is first defined in terms of $\mathcal{B}_T$ alone and then every pair $(X_T,
\mathcal{B}_T)$, or $X_T$ alone, is said to have this property. For example,
"independence" as well as "conditional independence" are defined in terms of
$\sigma$-fields alone (see 16.1 as well as 28.3), and so is "Markov independence"
(see 43.1). This is also the case of "random times" that we introduce
and study in what follows.

Let $\Omega_\tau \subset \Omega$ be a nonnull event to which we assign the trace $\sigma$-field
$\mathcal{A}_\tau = \mathcal{A} \cap \Omega_\tau$ of events in $\Omega_\tau$. To every Borel subset $S \subset R$ we assign
the trace $\sigma$-field $\mathcal{S}_S = \mathcal{S} \cap S$ of Borel sets in $S$. Once and for all, $T$
denotes a Borel set in $R$.

We say that a measurable function $\tau$ on $\Omega_\tau$ to $T$ is a $\mathcal{B}_T$-time if, for every
$t \in T$,

$$[\tau \le t] \in \mathcal{B}_t.$$

Outside of $\Omega_\tau$ with $P\Omega_\tau < 1$, $\tau$ may have no meaning or may not exist.
It is frequently convenient then to assign an exceptional value "$t_e$" to $\tau$
on $(\Omega_\tau)^c$, and the definition becomes accordingly: A measurable function
$\tau$ on $\Omega$ to $T \cup \{t_e\}$ is a $\mathcal{B}_T$-time if, for every $t \in T$,

$$[\tau \le t] \in \mathcal{B}_t.$$

This modification is particularly useful in the usual cases of $T = [0, \infty)$
or $T = \{1, 2, \cdots\}$ where we shall take automatically $t_e = +\infty$.
However, unless otherwise stated, our $\mathcal{B}_T$-times will be on $\Omega$ to $T$. Given $\mathcal{B}_T =
(\mathcal{B}_t, t \in T)$, we set once and for all

$$\mathcal{B}_{t+0} = \bigcap_{s > t} \mathcal{B}_s, \qquad \mathcal{B}_{t-0} = \bigvee_{s < t} \mathcal{B}_s, \qquad \mathcal{B} = \bigvee_{t \in T} \mathcal{B}_t,$$

where "$\vee$" stands for the $\sigma$-fields generated by the union fields. In the
"discrete case" of $T = (1, 2, \cdots)$, as used in §26, the condition $[\tau \le t] \in
\mathcal{B}_t$ for every $t \in T$ is clearly equivalent to $[\tau = t] \in \mathcal{B}_t$ for every $t \in T$,
and the $\mathcal{B}_{t-0}$, $\mathcal{B}_{t+0}$ are of no interest since for $t = n$ they reduce to
$\mathcal{B}_{n-1}$ and to $\mathcal{B}_{n+1}$.

To every $\mathcal{B}_T$-time $\tau$ we assign

$$\mathcal{B}_\tau = \{B \in \mathcal{B}:\ B[\tau \le t] \in \mathcal{B}_t,\ t \in T\}.$$

Note that every $t \in T$ is a degenerate $\mathcal{B}_T$-time with corresponding
family $\mathcal{B}_t$ of events. Intuitively, $\mathcal{B}_\tau$ consists of observable events up to time
$\tau$ (included).

Events belonging to any $\sigma$-field with or without subscripts will be
given the same subscripts, if any.

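A. Elementary properties of $\mathcal{B}_T$-times.

I. Let $\tau$ be a $\mathcal{B}_T$-time.

(i) $\mathcal{B}_\tau$ is a $\sigma$-field and $\tau$ is $\mathcal{B}_\tau$-measurable.

(ii) $\tau \wedge t = \min(\tau, t)$ and $\tau \vee t = \max(\tau, t)$ are $\mathcal{B}_t$-measurable.

(iii) $[\tau < t]$, $[\tau = t]$, $[\tau > t]$ belong to $\mathcal{B}_t$ for every $t \in T$.

(iv) If $\mathcal{B}_\tau$-measurable $\tau' \ge \tau$, then $\tau'$ is a $\mathcal{B}_T$-time.

(v) When $T$ is a finite interval closed on the right, say, $T = (0, 1]$ or
$[0, 1]$, then the simple $\mathcal{B}_T$-times

$$\tau_n = \sum_{k=1}^{2^n} \frac{k}{2^n}\, I_{\left[\frac{k-1}{2^n} \le \tau < \frac{k}{2^n}\right]} \downarrow \tau.$$

When $T$ is an infinite interval, say, $T = [0, \infty)$ or $T = [0, \infty]$, then the
elementary $\mathcal{B}_T$-times

$$\tau_n = \sum_{k=1}^{\infty} \frac{k}{2^n}\, I_{\left[\frac{k-1}{2^n} \le \tau < \frac{k}{2^n}\right]} \downarrow \tau$$

or

$$\tau'_n = \tau_n + (\infty)\, I_{[\tau = \infty]} \downarrow \tau.$$

II. Let $\sigma$ and $\tau$ be $\mathcal{B}_T$-times.

(i) If $\sigma \le \tau$ then $\mathcal{B}_\sigma \subset \mathcal{B}_\tau$.

(ii) $\sigma \wedge \tau$ and $\sigma \vee \tau$ are $\mathcal{B}_T$-times.

(iii) $[\sigma < \tau]$, $[\sigma = \tau]$, $[\sigma > \tau]$ belong to $\mathcal{B}_{\sigma \wedge \tau} = \mathcal{B}_\sigma \cap \mathcal{B}_\tau$.

(iv) $B_\sigma[\sigma \le \tau] \in \mathcal{B}_\tau$ for every $B_\sigma \in \mathcal{B}_\sigma$.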

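Proof. When not immediate, the proofs are elementary.

1°. We prove I. Properties (i) and (ii) are immediate and property
(iii) obtains by

$$[\tau > t] = [\tau \le t]^c \in \mathcal{B}_t, \qquad [\tau < t] = \bigcup_n \left[\tau \le t - \frac{1}{n}\right] \in \mathcal{B}_t,$$

$$[\tau = t] = [\tau \le t] - [\tau < t] \in \mathcal{B}_t.$$

In fact, (ii) and (iii) are particular cases of (ii) and (iii) in II.

Property (iv) obtains by $[\tau' \le t] \in \mathcal{B}_\tau$ since $\tau'$ is $\mathcal{B}_\tau$-measurable and
$\tau' \ge \tau$ implies that, for every $t \in T$,

$$[\tau' \le t] = [\tau' \le t][\tau \le t] \in \mathcal{B}_t.$$

Property (v) is immediate, provided one notices that, by (iii), the $\tau_n$
are times.

2°. We prove II. Property (i) is proved as follows. Given $B_\sigma \in \mathcal{B}_\sigma$,
for every $t \in T$, $[\sigma \le t] \supset [\tau \le t]$ since $\sigma \le \tau$, hence

$$B_\sigma[\tau \le t] = (B_\sigma[\sigma \le t])[\tau \le t] \in \mathcal{B}_t$$

since

$$B_\sigma[\sigma \le t] \in \mathcal{B}_t \quad \text{and} \quad [\tau \le t] \in \mathcal{B}_t.$$

Property (ii) results from

$$[\sigma \wedge \tau \le t] = [\sigma \le t] \cup [\tau \le t] \in \mathcal{B}_t$$

and

$$[\sigma \vee \tau \le t] = [\sigma \le t] \cap [\tau \le t] \in \mathcal{B}_t.$$

We shall return to property (iii) after establishing (iv): $B_\sigma[\sigma \le \tau] \in \mathcal{B}_\tau$
means that, for every $t \in T$,

$$(B_\sigma[\sigma \le \tau])[\tau \le t] = (B_\sigma[\sigma \le t])([\tau \le t])([\sigma \wedge t \le \tau \wedge t])$$

belongs to $\mathcal{B}_t$, which follows from the fact that each of the three right side
events belongs to $\mathcal{B}_t$: $B_\sigma[\sigma \le t]$ since $B_\sigma \in \mathcal{B}_\sigma$, $[\tau \le t]$ by definition of
$\mathcal{B}_T$-time $\tau$, and $[\sigma \wedge t \le \tau \wedge t]$ since, by (ii), both sides of the
inequality are $\mathcal{B}_t$-measurable.

Finally, we prove (iii): $[\sigma \le \tau]$ belongs to $\mathcal{B}_\tau$ by (iv), and so does its
complement $[\sigma > \tau]$. Since $\sigma \wedge \tau$ is $\mathcal{B}_{\sigma \wedge \tau}$-measurable hence also, by (i),
$\mathcal{B}_\tau$-measurable, the events

$$[\sigma \wedge \tau = \tau] = [\sigma \ge \tau] \quad \text{and} \quad [\sigma \wedge \tau < \tau] = [\sigma < \tau]$$

belong to $\mathcal{B}_\tau$, and so does $[\sigma = \tau] = [\sigma \le \tau][\sigma \ge \tau]$. They also belong
to $\mathcal{B}_\sigma$ since $\sigma$ and $\tau$ can be interchanged.
It remains to show that $\mathcal{B}_{\sigma \wedge \tau} = \mathcal{B}_\sigma \cap \mathcal{B}_\tau$. On the one hand, $\sigma \wedge \tau \le \sigma$
and $\sigma \wedge \tau \le \tau$ imply that $\mathcal{B}_{\sigma \wedge \tau} \subset \mathcal{B}_\sigma \cap \mathcal{B}_\tau$. On the other hand, for
$B \in \mathcal{B}_\sigma \cap \mathcal{B}_\tau$,

$$B[\sigma \wedge \tau \le t] = (B[\sigma \le t]) \cup (B[\tau \le t]) \in \mathcal{B}_t,$$

hence $\mathcal{B}_\sigma \cap \mathcal{B}_\tau \subset \mathcal{B}_{\sigma \wedge \tau}$.
The proof is terminated.

The elementary properties in A will be used without further comment.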
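The approximation of a time from above by times with dyadic values can be written out explicitly. The sketch below assumes the half-open convention in which $\tau_n = k/2^n$ when $(k - 1)/2^n \le \tau < k/2^n$; this convention is an assumption made here for illustration, as is the function name `dyadic_upper`, and the computation is carried out for one fixed value of $\tau$ rather than for a r.v.

```python
import math
from fractions import Fraction


def dyadic_upper(tau, n):
    """Value k / 2**n of the n-th approximation, where (k - 1) / 2**n <= tau < k / 2**n."""
    scale = 2 ** n
    k = math.floor(tau * scale) + 1
    return Fraction(k, scale)


tau = Fraction(1, 3)
approximations = [dyadic_upper(tau, n) for n in range(1, 12)]
```

With this convention each approximation lies strictly above $\tau$ and within $2^{-n}$ of it, and the sequence is nonincreasing. Moreover $[\tau_n \le t]$ is the event $[\tau < k/2^n]$ for the largest $k$ with $k/2^n \le t$, which is why events of the form $[\tau < t]$ matter for the approximations to be times.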

We consider now pairs $(X_T, \mathcal{B}_T)$; recall that every $X_t$ is $\mathcal{B}_t$-measurable.
To $\mathcal{B}_T$-time $\tau$ (to be also called time of $(X_T, \mathcal{B}_T)$) we associate the
function $X_\tau$ on $\Omega$ defined by

$$X_\tau(\omega) = X_{\tau(\omega)}(\omega), \qquad \omega \in \Omega.$$

$X_\tau$ is a r.v. since it is a measurable function on $(\Omega, \mathcal{A})$ to $(\Omega \times T, \mathcal{A} \times \mathcal{S}_T)$
of a measurable function on $(\Omega \times T, \mathcal{A} \times \mathcal{S}_T)$ to $(R, \mathcal{S})$ or, for short, the
composition of two measurable maps: $\omega \to (\omega, \tau(\omega))$ and $(\omega, t) \to X_t(\omega)$.
We saw that to $\tau$ is associated the $\sigma$-field $\mathcal{B}_\tau$ with respect to which it is
measurable. The question which arises is when $X_\tau$ is $\mathcal{B}_\tau$-measurable,
that is, when the observable events in terms of $X_\tau$ are observable events
up to time $\tau$ (included)? In order to answer it in the affirmative, in the
most important case of sample right-continuous $X_T$ on an interval $T \subset R$,
it is convenient to consider "progressively Borel" r.f.'s or, more precisely,
"progressively Borel" pairs $(X_T, \mathcal{B}_T)$ since they are defined in terms of
$X_T$ and $\mathcal{B}_T$. To simplify the writing, set $T_t = \{s \in T: s \le t\}$,
$X = X_T$ and $\mathcal{S} = \mathcal{S}_T$, $\mathcal{S}_t = \mathcal{S}_{T_t}$.

$(X_T, \mathcal{B}_T)$ is said to be progressively Borelian if, for every $t \in T$, the
restriction of $X$ on $\Omega \times T_t$ to $R$ is $(\mathcal{B}_t \times \mathcal{S}_t)$-measurable. Note that when
every $\mathcal{B}_t = \mathcal{A}$, this definition becomes that of Borelian $X_T$. Clearly, if
$(X_T, \mathcal{B}_T)$ is progressively Borelian then $X_T$ is Borelian.

a. A sample right-continuous $(X_T, \mathcal{B}_T)$ on an interval $T$ is progressively
Borelian.

Proof. To simplify the writing and also because it is the usual case,
take $T = [0, \infty)$. Given $t \in T$, for every $k \le 2^n$ and every
$s \in \left[\dfrac{(k-1)t}{2^n}, \dfrac{kt}{2^n}\right)$, set $X_s^{(n)} = X_{kt/2^n}$ and set $X_s^{(n)} = X_t$ for $s = t$. The
map $(\omega, s) \to X_s^{(n)}(\omega)$ on $(\Omega \times [0, t], \mathcal{B}_t \times \mathcal{S}_t)$ to $(R, \mathcal{S})$ is measurable
hence, by sample right-continuity, so is the limit $X_s$ of $X_s^{(n)}$ for every
$s \in T_t$. Thus $(X_T, \mathcal{B}_T)$ is progressively Borelian.

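b. If $\tau$ is a time of progressively Borelian $(X_T, \mathcal{B}_T)$, then $X_\tau$ is
$\mathcal{B}_\tau$-measurable.

Proof. To prove that for every Borel set $S \subset R$ and every $t \in T$,
upon setting $\tau_t = \tau \wedge t$,

$$[X_\tau \in S][\tau \le t] = [X_{\tau_t} \in S][\tau < t] \cup [X_t \in S][\tau = t]$$

belongs to $\mathcal{B}_t$ it suffices to show that $X_{\tau_t}$ is $\mathcal{B}_t$-measurable. But this
obtains by hypothesis since $X_{\tau_t}$ is the composition of the measurable maps
$\omega \to (\omega, \tau_t(\omega))$ on $(\Omega, \mathcal{B}_t)$ to $(\Omega \times T_t, \mathcal{B}_t \times \mathcal{S}_t)$ and $(\omega, s) \to X_s(\omega)$ on
$(\Omega \times T_t, \mathcal{B}_t \times \mathcal{S}_t)$ to $(R, \mathcal{S})$.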

B. $\mathcal{B}_\tau$-measurability theorem. If $\tau$ is a time of $(X_T, \mathcal{B}_T)$ then $X_\tau$ is
$\mathcal{B}_\tau$-measurable when $\tau$ is an elementary time or when $X_T$ is sample
rightcontinuous on an interval $T$.

Proof. When $\tau$ is elementary, that is, has a countable set of distinct
values $t_1, t_2, \cdots$ then, for every Borel set $S \subset R$ and every $t \in T$,

$$[X_\tau \in S][\tau \le t] = \sum_{t_j \le t} [X_{t_j} \in S][\tau = t_j] \in \mathcal{B}_t,$$

while the sample rightcontinuity case obtains by a and b.

§ 39. MARTINGALES 

The main r.f.’s and processes studied in this Part V are, by increasing 
order of generality, Brownian, decomposable, martingale and semi- 
martingale, and Markovian types. We begin with martingales and 
semimartingales (submartingales and supermartingales). For, while 
important in their own right, they are extremely useful in investigating 
the other types as well as in the applications of pr. methods to various 
branches of mathematical analysis. 

P. Lévy introduced, studied, and used the martingale concept in the
discrete case of sequences of r.v.'s. Then Doob deepened his results,
introduced semimartingales, proceeded to a systematic investigation of
these concepts and, thus, made them into the powerful tool (presented
here and further sharpened by P. A. Meyer) for pr. theory and for
classical Analysis.

Section 32 was devoted to a direct study and applications of the discrete
case; the results therein will be obtained here and, in fact, completed,
as very particular cases.

The approach will be centered on random times and the main theme
will be preservation of (semi)martingale property under various
transformations of r.f.'s, passages to the limit and randomization of time.

Unless otherwise stated, our r.f.'s are taken to be separable, and our
$\sigma$-fields are sub $\sigma$-fields of the $\sigma$-field $\mathcal{A}$ of events of some underlying
pr. space $(\Omega, \mathcal{A}, P)$.

We use the following notation and terminology: 

1°. $X_T^+ = (X_t^+, t \in T)$, $X_T \vee c = (X_t \vee c, t \in T)$, $c \in R$,
$|X_T|^r = (|X_t|^r, t \in T)$, $EX_T = (EX_t, t \in T)$.

2°. $\mathcal{B}_T = (\mathcal{B}_t, t \in T)$ where the $\mathcal{B}_t$ are $\sigma$-fields nondecreasing with
increasing $t$.

3°. $X_T \ge 0$ and $EX_T \ge 0$ means that the $X_t \ge 0$ and the $EX_t \ge 0$.
$X_T$ is integrable means that the $X_t$ are integrable. $X_T$ is uniformly
integrable means that the $X_t$ are uniformly integrable. Recall, to be
used without further comment, that uniform integrability of $X_T$ means
that

$$\sup_t \int_B |X_t| \to 0 \text{ as } PB \to 0 \quad \text{and} \quad \sup_t E|X_t| < \infty$$

and uniform integrability of $X_T$ is implied by $\sup_t E|X_t|^r < \infty$ for
some $r > 1$.
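
The last implication can be checked in a few lines. The following worked derivation is a sketch added for illustration (it is not part of the original text); it assumes only $M = \sup_t E|X_t|^r < \infty$ for some $r > 1$, and the constant $M$ and the cutoff $c > 0$ are names introduced here.

```latex
% On the event [|X_t| > c] we have |X_t| <= c^{1-r} |X_t|^r, since r > 1.
\int_{[|X_t| > c]} |X_t|
  \le c^{1-r} \int_{[|X_t| > c]} |X_t|^r
  \le c^{1-r} M \longrightarrow 0 \quad (c \to \infty), \text{ uniformly in } t.
% Hence, for every event B and every c > 0,
\int_B |X_t| \le c\, PB + \int_{[|X_t| > c]} |X_t| \le c\, PB + c^{1-r} M ,
% so that sup_t \int_B |X_t| -> 0 as PB -> 0 (let PB -> 0, then c -> infinity),
% while sup_t E|X_t| <= 1 + M < infinity.
```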


39.1. Closure and Limits. Subscripts of r.v.'s $X$ will belong to $T$,
unless otherwise stated.

We say that a r.f. $X_T$ is a martingale if, for every $s < t$ and every
$B_s \in \mathcal{B}_s = \mathcal{B}(X_r, r \le s)$,

(M) $\int_{B_s} X_s = \int_{B_s} X_t$, equivalently, $X_s = E(X_t \mid \mathcal{B}_s)$ a.s.

In fact, it suffices to require that for every finite set $r_1 < \cdots < r_n < s$
and every $B \in \mathcal{B}(X_{r_1}, \cdots, X_{r_n}, X_s)$

(M') $\int_B X_s = \int_B X_t$, equivalently, $X_s = E(X_t \mid X_{r_1}, \cdots, X_{r_n}, X_s)$ a.s.

For, (M) implies (M') since $\mathcal{B}(X_{r_1}, \cdots, X_{r_n}, X_s) \subset \mathcal{B}_s$, and (M') implies
(M) on account of the following property, already used in various
guises, that we isolate now.

a. Extension of measures lemma. Let a $\sigma$-field $\mathcal{B}$ be generated by a
field $\mathcal{C}$ which consists of finite sums of events belonging to a class $\mathcal{D}$. If
$\varphi$ and $\varphi'$ are signed measures on $\mathcal{B}$, $\sigma$-finite on $\mathcal{C}$, then $\varphi \le \varphi'$ ($\varphi = \varphi'$) on
$\mathcal{D}$ implies the same relation on $\mathcal{B}$, and if $\varphi$ and $\varphi'$ are $\sigma$-finite measures on
$\mathcal{D}$ then the same implication holds for their unique extensions to measures
on $\mathcal{B}$.

Proof. The assertion about measures on $\mathcal{D}$ reduces to the one about
signed measures. The equality assertion reduces to the inequalities one
upon applying that one to $\varphi \le \varphi'$ and to $\varphi' \le \varphi$. Since the class of
events on which $\varphi \le \varphi'$ is closed under countable summations and
contains $\mathcal{D}$, it contains the field $\mathcal{C}$ over $\mathcal{D}$. Furthermore, upon taking a
countable partition of $\Omega$, we may assume that $\varphi$ and $\varphi'$ are finite on this
class so that it is also closed under monotone passages to the limit, hence
contains the monotone field $\mathcal{B}$ over the field $\mathcal{C}$. The inequalities
assertion follows, and the proof is terminated.

We replace the term "martingale" by "sub(super)martingale" whenever
in the defining relations the sign "$=$" is replaced by "$\le$" ("$\ge$").

Various transformations of submartingales yield submartingales for
which the defining inequalities hold for larger $\sigma$-fields of events than
those induced by the transformed r.v.'s, in fact, for the $\sigma$-fields of the
initial submartingales. Thus, they leave invariant the corresponding
random times (see 38.4 and b below). This leads us to extend the
foregoing concept as follows.

Let $(X_T, \mathcal{B}_T)$ be a r.f. $X_T$ with accompanying $\mathcal{B}_T$ such that the $X_t$
are $\mathcal{B}_t$-measurable. We say that $(X_T, \mathcal{B}_T)$ is a martingale if, for every
$s < t$ and every $B_s \in \mathcal{B}_s$,

$$\int_{B_s} X_s = \int_{B_s} X_t, \text{ equivalently, } X_s = E(X_t \mid \mathcal{B}_s) \text{ a.s.}$$

$(X_T, \mathcal{B}_T)$ is said to be a sub(super)martingale when "$=$" is replaced by
"$\le$" ("$\ge$"). When $(X_T, \mathcal{B}_T)$ is one of them so is $X_T$ since $\mathcal{B}_s \supset \mathcal{B}(X_r, r \le s)$
but, in general, this inclusion relation is strict so that $(X_T, \mathcal{B}_T)$
has more properties, which may be lost if only $X_T$ were considered.

Elementary properties. The following properties are mostly im¬ 
mediate and will be used without further comment. 

1. A martingale is a submartingale and a supermartingale.

2. A semimartingale $(X_T, \mathcal{B}_T)$ with finite $EX_T$ is a martingale if and
only if $EX_T$ is a constant function on $T$.

In the semimartingale case, for $s < t$,

$$X_s \le E(X_t \mid \mathcal{B}_s) \text{ a.s. or } X_s \ge E(X_t \mid \mathcal{B}_s) \text{ a.s.}$$

and both sides cannot be equal a.s. unless the finite $EX_s = EX_t$.

3. If $(X_T, \mathcal{B}_T)$ and $(X'_T, \mathcal{B}_T)$ are submartingales (martingales) then
$(cX_T + c'X'_T, \mathcal{B}_T)$ is a submartingale (martingale) for any constants $c$, $c'$
in $[0, \infty)$ (in $R$).

4. If $(X_T, \mathcal{B}_T)$ is a submartingale (supermartingale), then $(-X_T, \mathcal{B}_T)$
is a supermartingale (submartingale).

5. Martingale and submartingale property is preserved when a same
constant is added to all its r.v.'s.

6. If $(X_T, \mathcal{B}_T)$ is a submartingale (martingale) then, for $s < t$,

$$EX_s \le EX_t \quad (EX_s = EX_t)$$

and more generally, for every $\mathcal{B}_s$-measurable $Y \ge 0$,

$$EX_sY \le EX_tY \quad (EX_sY = EX_tY).$$

Remark. According to 4, properties of submartingales $(X_T, \mathcal{B}_T)$
yield those of supermartingales upon changing $X_T$ into $-X_T$, and
conversely. Thus, it suffices to study, say, submartingales.

Let $g$ on $R$ be convex with $g(+\infty) = \lim_{x \to \infty} g(x) = \infty$. Recall that if
$E^{\mathcal{B}}X > -\infty$ a.s. then $g(E^{\mathcal{B}}X) \le E^{\mathcal{B}}g(X)$ a.s. and note that if $(X_T, \mathcal{B}_T)$
is a submartingale then, for $s < t$, $E^{\mathcal{B}_s}X_t \ge X_s > -\infty$.

A. Submartingale preserving transformations. Let $(X_T, \mathcal{B}_T)$ be
a martingale (submartingale) and $g$ be convex (and nondecreasing) with
$g(+\infty) = \lim_{x \to \infty} g(x) = \infty$. Then $(g(X_T), \mathcal{B}_T)$ is a submartingale.

In particular, $(X_T \vee c, \mathcal{B}_T)$ is a submartingale for every $c \in R$ and if
$X_T \ge 0$ or $(X_T, \mathcal{B}_T)$ is a martingale, then $(|X_T|^r, \mathcal{B}_T)$ is a submartingale
for every $r \ge 1$.

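Proof. Let $s < t$. For $g$ convex (and nondecreasing) if $(X_T, \mathcal{B}_T)$ is a
martingale (submartingale) then

$$g(X_s) = (\le)\ g(E^{\mathcal{B}_s}X_t) \le E^{\mathcal{B}_s}g(X_t) \text{ a.s.}$$

and $(g(X_T), \mathcal{B}_T)$ is a submartingale.

For $g(x) = x^+$ we obtain

$$X_s^+ \le E(X_t^+ \mid \mathcal{B}_s) \text{ a.s.}$$

and $(X_T^+, \mathcal{B}_T)$ is a submartingale. Since $X_T \vee c = (X_T - c)^+ + c$ and
$(X_T - c, \mathcal{B}_T)$ is a submartingale, so is $((X_T - c)^+, \mathcal{B}_T)$ hence so is
$(X_T \vee c, \mathcal{B}_T)$.

When $X_T \ge 0$ then, for $g(x) = 0$ or $x^r$ according as $x \le 0$ or $x > 0$,

$$X_s^r \le E^{\mathcal{B}_s}X_t^r \text{ a.s.}$$

and $(X_T^r, \mathcal{B}_T)$ is a submartingale.

When $(X_T, \mathcal{B}_T)$ is a martingale then, for $g(x) = |x|^r$ with $r \ge 1$,

$$|X_s|^r = |E^{\mathcal{B}_s}X_t|^r \le E^{\mathcal{B}_s}|X_t|^r \text{ a.s.}$$

and $(|X_T|^r, \mathcal{B}_T)$ is a submartingale. The proof is terminated.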
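
As a small numerical illustration of theorem A (not part of the original text), the sketch below takes the simple symmetric random walk $S_k$ as the martingale and checks exactly, by enumerating all paths, that $x^+$, $|x|$ and $x^2$ turn it into submartingales. The walk, the horizon `n = 8` and all function names are assumptions chosen for the example; because the increments are independent, the conditional expectation given the past reduces to an average over the two possible next steps.

```python
from fractions import Fraction
from itertools import product

def expectations(g, n):
    """Exact E g(S_k), k = 1..n, for a simple symmetric random walk S_k."""
    totals = [Fraction(0)] * n
    for steps in product((-1, 1), repeat=n):
        s = 0
        for k, step in enumerate(steps):
            s += step
            totals[k] += g(s)
    return [total / 2**n for total in totals]

def submartingale_defect(g, n):
    """Smallest value of E(g(S_{k+1}) | S_k = x) - g(x) over states x in [-n, n]."""
    return min(Fraction(g(x + 1) + g(x - 1), 2) - g(x) for x in range(-n, n + 1))

n = 8
transforms = {
    "x": lambda x: x,            # the martingale itself
    "x+": lambda x: max(x, 0),   # g(x) = x^+
    "|x|": abs,                  # g(x) = |x|^r with r = 1
    "x^2": lambda x: x * x,      # g(x) = |x|^r with r = 2
}
results = {name: expectations(g, n) for name, g in transforms.items()}
defects = {name: submartingale_defect(g, n) for name, g in transforms.items()}

for name in transforms:
    print(name, [str(v) for v in results[name]], "defect >=", defects[name])
```

A nonnegative defect is the submartingale inequality itself; the nondecreasing expectations are elementary property 6.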

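Closure. Let $(X_T, \mathcal{B}_T)$ be a martingale or a submartingale. We say
that it is closed on the left (right) by $X_a$ ($X_b$) if $T$ has a first (last) element
$a$ ($b$). Note that for every $t \in T$, $(X_u, u \ge t)$ ($(X_s, s \le t)$) is
closed on the left (right) by $X_t$.

More generally, we say that $(X_T, \mathcal{B}_T)$ is closed on the left (right) by $Y$ if
there are $\alpha \le \inf T$ ($\beta \ge \sup T$) and a $\sigma$-field $\mathcal{B}_\alpha$ ($\mathcal{B}_\beta$) with $\mathcal{B}(Y) \subset \mathcal{B}_\alpha \subset \mathcal{B}_t$
($\mathcal{B}(Y) \subset \mathcal{B}_\beta$, $\mathcal{B}_t \subset \mathcal{B}_\beta$) for all $t \in T$ such that, setting $X_\alpha = Y$ ($X_\beta = Y$),
$(X_{\{\alpha\} \cup T}, \mathcal{B}_{\{\alpha\} \cup T})$ ($(X_{T \cup \{\beta\}}, \mathcal{B}_{T \cup \{\beta\}})$) is a martingale or a submartingale as
is $(X_T, \mathcal{B}_T)$.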

Elementary closure properties. The following properties are almost
immediate and will be used without further comment.

1. A submartingale $(X_T, \mathcal{B}_T)$ is closed on the right by $Y$ if and only if, for
all $t \in T$,

$$X_t \le E(Y \mid \mathcal{B}_t) \text{ a.s.}$$

2. $(X_T, \mathcal{B}_T)$ is a martingale closed on the right by $Y$ if and only if
$X_t = E(Y \mid \mathcal{B}_t)$ a.s. for all $t \in T$.

The "only if" assertion is immediate and the "if" assertion obtains
from $\mathcal{B}_s \subset \mathcal{B}_t$ when $s < t$ by

$$X_s = E(Y \mid \mathcal{B}_s) = E(E(Y \mid \mathcal{B}_t) \mid \mathcal{B}_s) = E(X_t \mid \mathcal{B}_s) \text{ a.s.}$$

We shall use a standard limiting procedure for separable r.f.'s: Let
$S_n = \{s_1, \cdots, s_n\} \uparrow S = \{s_1, s_2, \cdots\}$ which separates $X_T$. To establish
a relation for $X_T$, do it for $(X_k, k = 1, \cdots, n)$, then for $X_{S_n}$ by
replacing $k$ by $t_k$ where the $t_k$ are points of $S_n$ reordered increasingly, and
pass to the limit.

b. Right closure lemma. Let $(X_T, \mathcal{B}_T)$ be a nonnegative submartingale
closed on the right by a nonnegative r.v. $Y$, and let $0 < c < \infty$. Then

$$c\,P(\sup_t X_t > c) \le \int_{[\sup_t X_t > c]} Y \le EY$$

and, for $r > 1$,

$$E(\sup_t X_t)^r \le s^r EY^r, \text{ where } \frac{1}{r} + \frac{1}{s} = 1.$$

If $EY < \infty$ then $\sup_t X_t < \infty$ a.s. and $X_T$ is uniformly integrable.

Proof. 1°. To prove the first inequality, it suffices to show that for
a submartingale $(X_k, k = 1, \cdots, n)$ closed on the right by $Y$,

(1) $c\,PA_n \le \int_{A_n} Y \le EY$, where $A_n = [\max_k X_k > c]$.

For,

(2) $\max_{t \in S_n} X_t \uparrow \sup_{t \in S} X_t = \sup_t X_t$

and the standard limiting procedure applies.

But, upon setting

$B_1 = [X_1 > c]$, $B_k = [X_k > c, \max_{j<k} X_j \le c]$ for $k > 1$,

$A_n = \sum_k B_k$ hence

$$EY \ge \int_{A_n} Y = \sum_k \int_{B_k} Y \ge \sum_k \int_{B_k} X_k \ge c \sum_k PB_k = c\,PA_n$$

and (1) is proved.

Let $EY < \infty$ and let $c \to \infty$. Then

$$P(\sup_t X_t > c) \le EY/c \to 0,$$

hence $\sup_t X_t < \infty$ a.s. Since

$$\sup_t P(X_t > c) \le \sup_t EX_t/c \le EY/c \to 0$$

and $Y$ is integrable,

$$\sup_t \int_{[X_t > c]} X_t \le \sup_t \int_{[X_t > c]} Y \to 0,$$

hence $X_T$ is uniformly integrable.

2°. Upon setting $X = \sup_t X_t$, it remains to prove

(3) $EX^r \le s^r EY^r$, where $EX^r > 0$ and $EY^r < \infty$

since the inequality is trivially true when $EX^r = 0$ or $EY^r = \infty$.

It suffices to show that (3) holds for nonnegative r.v.'s $X$ and $Y$ such
that, for every $c > 0$,

(4) $c\,P(X > c) \le \int_{[X > c]} Y$.

Set $q(c) = P(X > c)$ and let $g$ on $[0, \infty)$ be nondecreasing with $g(0) = 0$.
Then

$$Eg(X) = -\int_0^\infty g(c)\,dq(c) \le \int_0^\infty q(c)\,dg(c) \le \int_0^\infty \Big(\int_{[X > c]} Y\,dP\Big) \frac{dg(c)}{c} = \int Y \Big(\int_0^X \frac{dg(c)}{c}\Big)\,dP.$$

Thus, for $g(c) = c^r$ with $r > 1$, by Hölder inequality,

$$EX^r \le sE(YX^{r-1}) \le s(EY^r)^{1/r}(EX^{(r-1)s})^{1/s} = s(EY^r)^{1/r}(EX^r)^{1/s}$$

and (3) obtains upon dividing by $(EX^r)^{1/s}$ provided $EX^r < \infty$. To
obviate this difficulty, replace $X$ by $X \wedge c$ so that $E(X \wedge c)^r \le c^r < \infty$
and $E(X \wedge c)^r > 0$ for $c$ sufficiently large since, as $c \uparrow \infty$, $X \wedge c \uparrow X$
hence $E(X \wedge c)^r \uparrow EX^r > 0$. Inequality (4) holds a fortiori for $X \wedge c$,
the division mentioned above is valid, and then let $c \uparrow \infty$.

The proof is terminated.
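
The two inequalities of the right closure lemma can be checked exactly on a small discrete example. The sketch below is an illustration only (not from the original text): it assumes the nonnegative submartingale $X_k = |S_k|$, $k = 1, \cdots, n$, built from a simple symmetric random walk and closed on the right by $Y = |S_n|$, and it enumerates all $2^n$ paths. The function names and the choices `n = 8`, `r = 2` are hypothetical.

```python
from fractions import Fraction
from itertools import product

def right_closure_terms(n, c):
    """Return exactly (c P(max_k X_k > c), integral of Y over [max_k X_k > c], EY)
    for X_k = |S_k| and Y = |S_n|."""
    weight = Fraction(1, 2**n)
    lhs = middle = mean_y = Fraction(0)
    for steps in product((-1, 1), repeat=n):
        s = running_max = 0
        for step in steps:
            s += step
            running_max = max(running_max, abs(s))
        mean_y += weight * abs(s)
        if running_max > c:
            lhs += weight * c
            middle += weight * abs(s)
    return lhs, middle, mean_y

def moment_terms(n, r=2):
    """Return (E (max_k X_k)^r, s^r E Y^r) with 1/r + 1/s = 1, integer r > 1."""
    weight = Fraction(1, 2**n)
    s_conj = Fraction(r, r - 1)          # conjugate exponent s
    e_max = e_y = Fraction(0)
    for steps in product((-1, 1), repeat=n):
        s = running_max = 0
        for step in steps:
            s += step
            running_max = max(running_max, abs(s))
        e_max += weight * running_max**r
        e_y += weight * abs(s)**r
    return e_max, s_conj**r * e_y

for c in range(1, 8):
    lhs, middle, mean_y = right_closure_terms(8, c)
    print(c, float(lhs), float(middle), float(mean_y))
print([float(v) for v in moment_terms(8)])
```

For each `c` the three printed numbers should be nondecreasing from left to right, and the last line compares $E(\max_k X_k)^2$ with $4\,EY^2$.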

B. Suprema theorem. Let $(X_T, \mathcal{B}_T)$ be a submartingale and let $c \in R$.
Then

$$c\,P(\sup_t X_t > c) \le \sup_t EX_t^+$$

and, for $r > 1$,

$$E(\sup_t X_t^+)^r \le s^r \sup_t E(X_t^+)^r.$$

If $\sup_t EX_t^+ < \infty$ then $\sup_t X_t < \infty$ a.s., $(X_s \vee c, s \le t)$ is uniformly
integrable for every $t \in T$, and

$$\sup_t E|X_t| \le 2 \sup_t EX_t^+ - \inf_t EX_t.$$

Proof. We use the preceding lemma without further comment.

The first inequality is trivially true when $c \le 0$. Thus, let $c > 0$. Since
the submartingale $(X_s^+, \mathcal{B}_s, s \le t)$ with $t \in T$ is closed by $X_t^+$, and
$X_s \le X_s^+$,

$$c\,P(\sup_{s \le t} X_s > c) \le c\,P(\sup_{s \le t} X_s^+ > c) \le EX_t^+ \le \sup_t EX_t^+$$

and, for $r > 1$,

$$E(\sup_{s \le t} X_s^+)^r \le s^r E(X_t^+)^r \le s^r \sup_t E(X_t^+)^r.$$

Thus, the first two asserted inequalities obtain upon letting $t$ vary over $T$.
Let $\sup_t EX_t^+ < \infty$. The first inequality yields, as $c \to \infty$,

$$P(\sup_t X_t > c) \le \sup_t EX_t^+/c \to 0,$$

hence $\sup_t X_t < +\infty$ a.s. The last inequality obtains by

$$E|X_t| = 2EX_t^+ - EX_t \le 2 \sup_t EX_t^+ - \inf_t EX_t$$

and it remains to prove the asserted uniform integrability.

Since the submartingale $(X_s^+, \mathcal{B}_s, s \le t)$ with $t \in T$ is closed on the
right by integrable $X_t^+$, the assertion is valid for $(X_s^+, s \le t)$. Since
$|X_s \vee c| = |(X_s - c)^+ + c| \le X_s^+ + |c|$, the asserted uniform
integrability for $(X_s \vee c, s \le t)$ with $t \in T$ follows, and the proof is
completed.

Martingale times. When $(X_T, \mathcal{B}_T)$ is a martingale or a semimartingale
it is convenient to say that $\mathcal{B}_T$-times (defined on $\Omega$ to $T$) are its times or
martingale times. The following proposition is central to the whole
section.

c. Central lemma. Let $\sigma \le \tau \le n$ be times of a submartingale
(martingale) $(X_k, \mathcal{B}_k)$, $k = 1, \cdots, n$.

If $EX_\sigma$ and $EX_\tau$ exist, then $(X_\sigma, \mathcal{B}_\sigma)$ and $(X_\tau, \mathcal{B}_\tau)$ form a submartingale
(martingale) and

$$EX_1 \le EX_\sigma \le EX_\tau \le EX_n \quad (EX_1 = EX_\sigma = EX_\tau = EX_n).$$

If $EX_1 > -\infty$ or $EX_n^+ < +\infty$ then all $EX_\tau$ exist and

$$E|X_\tau| \le 2EX_n^+ - EX_1.$$


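Proof. If $B_k \in \mathcal{B}_k$ and $j > k$ then $B_k[\tau > j - 1] \in \mathcal{B}_{j-1}$, hence

$$\int_{B_k[\tau > j-1]} X_{j-1} \le \int_{B_k[\tau > j-1]} X_j = \int_{B_k[\tau = j]} X_j + \int_{B_k[\tau > j]} X_j$$

and, summing over $j = k + 1, \cdots, n$,

$$\int_{B_k[\tau > k]} X_k \le \sum_{j=k+1}^{n} \int_{B_k[\tau = j]} X_j = \int_{B_k[\tau > k]} X_\tau.$$

Since for $B_\sigma \in \mathcal{B}_\sigma$ we can take $B_k = B_\sigma[\sigma = k]$ and, $\sigma \le \tau$, $[\sigma = k][\tau < k] = \emptyset$,
it follows that

$$\int_{B_\sigma[\sigma = k]} X_\sigma = \int_{B_\sigma[\sigma = k][\tau = k]} X_k + \int_{B_\sigma[\sigma = k][\tau > k]} X_k \le \int_{B_\sigma[\sigma = k]} X_\tau$$

and, summing over $k$,

$$\int_{B_\sigma} X_\sigma \le \int_{B_\sigma} X_\tau.$$

In particular, $EX_\sigma \le EX_\tau$ so that, taking $1 = \sigma' \le \sigma \le \tau \le \tau' = n$,

$$EX_1 \le EX_\sigma \le EX_\tau \le EX_n.$$

In the martingale case all foregoing inequalities become equalities.

Let $EX_1 > -\infty$ hence the $EX_k > -\infty$, or $EX_n^+ < \infty$ hence the
$EX_k^+ < \infty$. Then, for every martingale time $\tau \le n$,

$$EX_\tau = \sum_k E(X_k I_{[\tau = k]}) > -\infty \text{ or } < +\infty \text{ exists.}$$

Finally, since $(X_k^+, \mathcal{B}_k, k = 1, \cdots, n)$ is a submartingale hence
$EX_\tau^+ \le EX_n^+$, it follows that

$$E|X_\tau| = 2EX_\tau^+ - EX_\tau \le 2EX_n^+ - EX_1.$$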
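
The chain of expectations in the central lemma can be verified exactly on a toy case. The sketch below is an illustrative assumption, not the book's construction: the martingale is the simple symmetric random walk $S_1, \cdots, S_n$, the submartingale is $S_k^2$, and the two times are first passages of $|S_k|$ to hypothetical levels `low` and `high` (set to $n$ if the level is not reached), so that $\sigma \le \tau \le n$.

```python
from fractions import Fraction
from itertools import product

def first_hit(path, level):
    """0-based index of the first k with |S_k| >= level, or the last index if none."""
    for k, s in enumerate(path):
        if abs(s) >= level:
            return k
    return len(path) - 1

def stopped_means(n, low=2, high=4, g=lambda x: x):
    """Exact (E g(S_1), E g(S_sigma), E g(S_tau), E g(S_n)) with sigma <= tau <= n."""
    weight = Fraction(1, 2**n)
    means = [Fraction(0)] * 4
    for steps in product((-1, 1), repeat=n):
        path, s = [], 0
        for step in steps:
            s += step
            path.append(s)
        sigma, tau = first_hit(path, low), first_hit(path, high)
        for i, value in enumerate((path[0], path[sigma], path[tau], path[-1])):
            means[i] += weight * g(value)
    return means

martingale_means = stopped_means(8)                     # equalities expected
square_means = stopped_means(8, g=lambda x: x * x)      # nondecreasing chain expected
print([str(v) for v in martingale_means])
print([str(v) for v in square_means])
```

Both `first_hit` times depend only on the path up to the current index, which is what makes them times of the walk in the sense used here.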

Applications. The basic properties of submartingales stem from the 
above lemma. On the one hand, it yields the inequality (and more) from 
which A stems. On the other hand，it yields the ‘‘crossings’’ inequality 
from which the limit properties stem. 

Let $(X_k, \mathcal{B}_k, k = 1, \cdots, n)$ be a submartingale with $EX_1 > -\infty$
or $EX_n^+ < +\infty$.

1°. Two inequalities. Let $\sigma(\omega)$ ($\tau(\omega)$) be the smallest $k \le n$, if any,
such that $X_k(\omega) > c$ ($X_k(\omega) < c$) or be $n$ if there is no such $k$. Set

$$B_n = [\sup_k X_k > c] \quad (C_n = [\inf_k X_k < c]).$$

Apply the lemma to $\sigma \le n$ and $1 \le \tau$ so that

$$c\,PB_n \le \int_{B_n} X_\sigma \le \int_{B_n} X_n$$

and

$$EX_1 \le \int_{C_n} X_\tau + \int_{C_n^c} X_\tau \le c\,PC_n + \int_{C_n^c} X_n.$$

Let now $(X_T, \mathcal{B}_T)$ be a submartingale with $EX_T > -\infty$ or $EX_T^+ < +\infty$
and use the standard limiting procedure with separating set
$S = \{s_1, s_2, \cdots\}$ and $S_n = \{s_1, \cdots, s_n\} \uparrow S$. If the submartingale
is closed on the right by $Y$, then the first inequality yields

$$c\,P(\sup_{t \in S_n} X_t > c) \le \int_{[\sup_{t \in S_n} X_t > c]} Y$$

and, passing to the limit,

$$c\,P(\sup_t X_t > c) \le \int_{[\sup_t X_t > c]} Y.$$

This is the main inequality in b.

Similarly, by the same procedure, the second inequality yields

$$\inf_t EX_t \le \int_{[\inf_t X_t \ge c]} Y + c\,P(\inf_t X_t < c).$$

This new inequality completes b.

2°. Crossings. The number $H_n(\omega)$ of crossings (from the left) by
$(X_1(\omega), \cdots, X_n(\omega))$ of a finite interval $[r, s]$ is the number of times
that, starting with $X_1(\omega)$ and proceeding to $X_n(\omega)$, we pass from the left
to the right of $[r, s]$. Let $\tau_1(\omega) = 1$, $\tau_2(\omega)$ be the smallest $k \ge \tau_1(\omega)$, if
any, for which $X_k(\omega) \le r$, $\tau_3(\omega)$ the smallest $k \ge \tau_2(\omega)$, if any, for which
$X_k(\omega) \ge s$ and so on, proceeding alternately up to $\tau_n(\omega) = n$; from
the first undefined $\tau_j(\omega)$, if any, set $\tau_j(\omega) = \cdots = \tau_n(\omega) = n$. $H_n$ is
also the (random) number of crossings of $[0, s - r]$ by the submartingale
formed by the $((X_k - r)^+, \mathcal{B}_k)$. Set $Y_k = (X_k - r)^+$ so that

$$Y_n = \sum_{j=2}^{n} (Y_{\tau_j} - Y_{\tau_{j-1}}) + Y_1$$

and, by the lemma, the $(Y_{\tau_j}, \mathcal{B}_{\tau_j})$ form a submartingale.

Let $EY_n < \infty$ so that $0 \le EY_{\tau_j} < \infty$ and, in

$$EY_n = \sum_{j=2}^{n} E(Y_{\tau_j} - Y_{\tau_{j-1}}) + EY_1,$$

the sum of the first $H_n$ summands with odd $j$ is at least $(s - r)EH_n$ while
the remaining ones are nonnegative. Thus

$$(s - r)EH_n \le E(X_n - r)^+$$

and the inequality is trivially true when $E(X_n - r)^+ = +\infty$. Note
that $H_n$ does not decrease as $n$ increases.

Let now $(X_T, \mathcal{B}_T)$ be a submartingale with $EX_T > -\infty$ or $EX_T^+ < +\infty$.
The standard limiting procedure starting with $H_n$ replaced by
$H_{S_n}(r, s)$ yields the basic crossings inequality

$$(s - r)EH_T(r, s) \le \sup_t E(X_t - r)^+;$$

we shall omit the subscript $T$, unless confusion is possible.
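
The counting rule for crossings and the finite crossings inequality can be tried out directly. The sketch below is an illustration under stated assumptions (it is not taken from the original text): `crossings` counts passages from values $\le r$ to values $\ge s$ for $r < s$, following the alternating rule for the $\tau_j$, and `crossing_terms` evaluates both sides of $(s - r)EH_n \le E(X_n - r)^+$ exactly for a simple symmetric random walk; the walk, the horizon and the interval are hypothetical choices.

```python
from fractions import Fraction
from itertools import product

def crossings(path, r, s):
    """Number of passages of the path from values <= r to values >= s (r < s)."""
    count, below = 0, False
    for x in path:
        if not below and x <= r:
            below = True          # the path has reached the left of [r, s]
        elif below and x >= s:
            count += 1            # one completed crossing from left to right
            below = False
    return count

def crossing_terms(n, r, s):
    """Exact ((s - r) E H_n, E (S_n - r)^+) for a simple symmetric random walk."""
    weight = Fraction(1, 2**n)
    lhs = rhs = Fraction(0)
    for steps in product((-1, 1), repeat=n):
        path, x = [], 0
        for step in steps:
            x += step
            path.append(x)
        lhs += weight * (s - r) * crossings(path, r, s)
        rhs += weight * max(path[-1] - r, 0)
    return lhs, rhs

example = crossings([0, 2, -1, 3, 0, 5], 0, 2)
lhs, rhs = crossing_terms(10, -1, 1)
print(example, float(lhs), float(rhs))
```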

Limits. We begin with 

d. Sample limits lemma. If $(X_T, \mathcal{B}_T)$ is a submartingale with
$\sup_t EX_t^+ < \infty$, then there is a null event $N$ such that for $\omega \notin N$, the
discontinuities of sample functions $X_T(\omega)$ are simple.

Proof. Since

$$(s - r)EH(r, s) \le \sup_t E(X_t - r)^+ \le \sup_t EX_t^+ + |r| < \infty,$$

$H(r, s) < \infty$ outside a null event $N(r, s)$ for every finite interval $[r, s]$.
Thus $X_T(\omega)$ crosses every such interval a finite number of times only for
$\omega \notin N(r, s)$. It follows that its discontinuities are simple for

$$\omega \notin N = \bigcup_{\text{rational } r, s} N(r, s).$$

For, if $t$ is a left limit point of $T$ with

$$\underline{X}_{t-0}(\omega) = \liminf_{t' \uparrow t} X_{t'}(\omega) < \limsup_{t' \uparrow t} X_{t'}(\omega) = \overline{X}_{t-0}(\omega)$$

then there are rationals $r$, $s$ such that

$$\underline{X}_{t-0}(\omega) < r < s < \overline{X}_{t-0}(\omega)$$

so that, as $t' \uparrow t$, $X_{t'}(\omega)$ is less than $r$ and greater than $s$ infinitely often
hence $\omega \in N$. Thus, $\underline{X}_{t-0}(\omega) = \overline{X}_{t-0}(\omega)$ for $\omega \notin N$. Similarly for
rightlimit points, and the lemma is proved.

Discussion. To avoid repeated mention of exceptions, we include all
null events in every $\mathcal{B}_t$ (replacing the initial $\mathcal{B}_t$, if necessary, by the $\sigma$-field
generated by it and the class of all null events). Thus, the above
exceptional null event $N$ belongs to every $\mathcal{B}_t$ and we either speak of almost
all sample functions or we replace $X_T(\omega)$ for $\omega \in N$ by the constant
function zero, according to convenience. The r.f. so obtained is a.s. equal to
the initial one and the submartingale or martingale property is
preserved. In what follows, to return to the initial one it suffices to add
"for almost all sample functions."

Let $T'$ be the set of (left or right or both) limit points of $T$ so that the
closure of $T$ is $\bar{T} = T \cup T' = TT'^c + T'$; set once and for all $a = \min \bar{T}$
and $b = \max \bar{T}$. To avoid further exceptions, note that the statement
of the following theorem is trivial when $a$ ($b$) belongs to $T$ but not
to $T'$, provided we set $X_{a+0} = X_a$ ($X_{b-0} = X_b$). Thus, we can assume
therein that $a$ ($b$) is a limit point.

C. Sample limits theorem. Let $(X_T, \mathcal{B}_T)$ be a submartingale.

I. Let $X_T^+$ be integrable.

(i) At leftlimit points $t < b$ or ($t \le b$ when $b \in T$ and then $X_T^+$ is
uniformly integrable), sample leftlimits $X_{t-0}$ exist, are $\mathcal{B}_{t-0}$-measurable and,
for $s < t \le u$ with $s, u \in T$,

(1) $X_s \le E(X_{t-0} \mid \mathcal{B}_s)$ a.s., $X_{t-0} \le E(X_u \mid \mathcal{B}_{t-0})$ a.s.;

$X_T^+$ is uniformly integrable if and only if $X_{b-0}^+$ exists and closes on the
right the submartingale $(X_t^+, \mathcal{B}_t, t < b)$.

(ii) At rightlimit points $t$, sample right limits $X_{t+0}$ exist, are
$\mathcal{B}_{t+0}$-measurable and, for $s \le t < u$ with $s, u \in T$,

(2) $X_s \le E(X_{t+0} \mid \mathcal{B}_s)$ a.s., $X_{t+0} \le E(X_u \mid \mathcal{B}_{t+0})$ a.s.;

the submartingale is closed on the left by $X_{a+0}$ or by $X_a$ when $a \in T$.

II. Let $X_T$ be integrable.

(i) Inequalities (1) and (2) hold; $X_{a+0}$ is also a limit in the first mean if
and only if $\lim_{t \downarrow a} EX_t > -\infty$; then all sample right limits are also limits in
the first mean.

In the martingale case, inequalities (1) and (2) become equalities, the if
and only if condition is satisfied and its consequences hold.

Furthermore, integrable $X_{b-0}$ exists and closes on the right the martingale
$(X_t, \mathcal{B}_t, t < b)$ if and only if $X_T$ is uniformly integrable; then all sample
left and right limits are also limits in the first mean and inequalities (1)
with $t \le b$, and inequalities (2) become equalities.

(ii) If $|X_T|^r$ is uniformly integrable for some $r \ge 1$ or $\sup_t E|X_t|^r < \infty$
for some $r > 1$ when $X_T \ge 0$ (when $(X_T, \mathcal{B}_T)$ is a martingale), then all
sample left and right limits are also limits in the $r$-th mean and inequalities
(equalities) (1) with $t \le b$ and inequalities (equalities) (2) hold.


Proof. 1°. We prove I; by hypothesis $X_T^+$ is integrable.

For every leftlimit point $t < b \in T'$ there is $u \in T$ with $t < u < b$
since $b$ is a leftlimit point. Thus, the submartingale $(X_s^+, \mathcal{B}_s, s \le u)$ is
closed by $X_u^+$ hence $\sup_{s \le u} EX_s^+ = EX_u^+ < \infty$ and, by d, $X_{t-0}$ exists; since
all $X_s$ are $\mathcal{B}_{t-0}$-measurable for $s < t$, so is $X_{t-0}$. By B, $(X_s \vee c, s \le u)$
is uniformly integrable hence, for $s < t' < t$ and $B_s \in \mathcal{B}_s$, as $t' \uparrow t$,

$$\int_{B_s} (X_s \vee c) \le \int_{B_s} (X_{t'} \vee c) \to \int_{B_s} (X_{t-0} \vee c).$$

By the Fatou-Lebesgue theorem, $X_{t-0}^+$ is integrable since the $X_s^+$ are
integrable. Hence so is $X_{t-0} \vee c$ and, letting $c \downarrow -\infty$,

$$\int_{B_s} X_s \le \int_{B_s} X_{t-0}.$$

Similarly, for $t \le u \in T$ and $B \in \bigcup_{s<t} \mathcal{B}_s \subset \bigvee_{s<t} \mathcal{B}_s = \mathcal{B}_{t-0}$,

$$\int_B (X_{t-0} \vee c) \le \int_B (X_u \vee c)$$

and, by a, this inequality holds for every $B_{t-0} \in \mathcal{B}_{t-0}$ hence, letting
$c \downarrow -\infty$,

$$\int_{B_{t-0}} X_{t-0} \le \int_{B_{t-0}} X_u.$$

Thus, inequalities (1) and, proceeding similarly, inequalities (2) obtain.

Finally, if $X_T^+$ is uniformly integrable so that $\sup_t EX_t^+ < \infty$, d
applies and $X_{b-0}$ exists. Thus, the above proof for leftlimit points holds
for $t = b$ and $(X_t, \mathcal{B}_t, t < b)$ is closed on the right by $X_{b-0}$ with
$EX_{b-0}^+ < \infty$. The converse obtains by b applied to $(X_t^+, \mathcal{B}_t, t < b)$
closed by $X_{b-0}^+$.

2°. We prove II; by hypothesis $X_T$ is integrable. We use I without
further comment.

Let $t < u$ with $t, u \in T$ so that $EX_t \le EX_u < +\infty$ hence
$\alpha = \lim_{t \downarrow a} EX_t < +\infty$. Thus, $\alpha > -\infty$ is equivalent to: $\alpha$ is finite. But then

$$E|X_t| = 2EX_t^+ - EX_t \le 2EX_u^+ - \alpha < \infty$$

so that $\beta = \sup_{t \le u} E|X_t| < \infty$. Upon setting $B_t = [|X_t| > c]$, it
follows that, as $c \to +\infty$,

$$PB_t \le E|X_t|/c \le \beta/c \to 0 \text{ uniformly in } t \le u.$$

But, given $\epsilon > 0$, there is $v \le u$ with $v \in T$ such that $EX_v - \alpha < \epsilon$
hence $EX_v - EX_t < \epsilon$ for $t \le v$. Therefore,

$$\int_{B_t} |X_t| = 2\int_{B_t} X_t^+ + \int_{B_t^c} X_t - EX_t \le 2\int_{B_t} X_v^+ + \int_{B_t^c} X_v - EX_t$$

$$= \int_{B_t} |X_v| + EX_v - EX_t < \int_{B_t} |X_v| + \epsilon$$

so that, letting $c \to +\infty$ then $\epsilon \to 0$,

$$\sup_{t \le v} \int_{B_t} |X_t| \le \sup_{t \le v} \int_{B_t} |X_v| + \epsilon \to 0.$$

Thus, the $X_t$ are uniformly integrable for $a < t \le v$ and $X_t \to X_{a+0}$ in the
first mean as $t \downarrow a$ hence $EX_t \to EX_{a+0}$ finite. Conversely, if $X_t \to X_{a+0}$
in the first mean as $t \downarrow a$ then $\alpha = EX_{a+0}$ is finite so that the condition
$\alpha > -\infty$ is equivalent to: there is a $v \in T$ such that the $X_t$ are uniformly
integrable for $a < t \le v$.

In fact, the preceding argument applies to every rightlimit point
$s > a$ since $a$ is a limit point so that there is $u \in T$ with $s > u > a$,
hence

$$\alpha_s = \lim_{t \downarrow s} EX_t \ge EX_u,$$

and the first part of (i) obtains.
For the martingale part of (i), let $(X_T, \mathcal{B}_T)$ be a martingale. Since
all $EX_t$ are equal, $\lim_{t \downarrow a} EX_t$ is finite and the first part applies.
Furthermore, $(X_t, t \le u \in T)$ being uniformly integrable since $X_u$ integrable
closes it on the right, sample leftlimits are also limits in the first mean
for all leftlimit points $t < b$ for, $b$ being a limit point, for every $t < b$
there is $u \in T$ with $t \le u < b$. Thus, inequalities (1) become
equalities. In fact, these equalities (1) hold for $t \le b$ when $X_T$ is uniformly
integrable so that $X_s \to X_{t-0}$ in the first mean as $s \uparrow t \le b$. Conversely, if
integrable $X_{b-0}$ exists and closes the martingale $(X_t, \mathcal{B}_t, t < b)$ then,
clearly, $X_T$ is uniformly integrable, and the martingale part obtains.

We prove (ii). The case of $|X_T|^r$ uniformly integrable for some $r \ge 1$
is immediate. The case of $\sup_t E|X_t|^r < \infty$ for some $r > 1$ when
$X_T \ge 0$ or $(X_T, \mathcal{B}_T)$ is a martingale follows, by A, from the fact that
$(|X_T|, \mathcal{B}_T)$ is a submartingale with, by B, $E(\sup_t |X_t|)^r < \infty$.

The proof is terminated.

Consequences. We use C without further comment. 

1°. Extension. Let $(X_T, \mathcal{B}_T)$ be a positively integrable submartingale
(an integrable martingale). Then it can be extended to a submartingale
(a martingale) on the interval $J$ with endpoints $a < b$ provided $b$ is excluded
unless $b \in T$ and $a$ is excluded unless $\lim_{t \downarrow a} EX_t > -\infty$.

For every rightlimit point $t \in J - T$ set $X_t = X_{t+0}$ and $\mathcal{B}_t = \mathcal{B}_{t+0}$
so that this extension is a submartingale (or a martingale). Then, for
every $[c, d]$ whose endpoints only belong to $T$, set $X_t = X_d$, $\mathcal{B}_t = \mathcal{B}_d$ for
$c < t < d$, and also for $t = c$ unless $X_c$ is already defined.

The continuous parameter case is that of intervals T on R with end¬ 
points a < b which may or may not belong to T. According to 1°, this 
can always be achieved under minimal usual assumptions. Then 

2°. Right regularization. Let $(X_T, \mathcal{B}_T)$ with continuous parameter
be a positively integrable submartingale (an integrable martingale). Set
$X_{b+0} = X_b$ when $b \in T$ and $X'_t = X_{t+0}$, $\mathcal{B}'_t = \mathcal{B}_{t+0}$ for every $t \in T$.

$(X'_T, \mathcal{B}'_T)$ is a submartingale (martingale) which is sample
rightcontinuous with sample leftlimits.

If $X_T$ is rightcontinuous in pr., or $EX_T$ and $\mathcal{B}_T$ are rightcontinuous,
then the r.f.'s $X'_T$ and $X_T$ are equivalent.

The first assertion is immediate. If $X_T$ is rightcontinuous in pr. then,
for every $t \in T$, $X_{t_n} \to X_t$ in pr. as $t_n \downarrow t$ hence $X_{t'_n} \to X_t$ a.s. as $t'_n \downarrow t$ along some
subsequence $(t'_n)$ of $(t_n)$. Since $X'_t = X_{t+0} = \lim_{u \downarrow t} X_u$, it follows that
$X'_t = X_t$ a.s., and $X'_T$ is equivalent to $X_T$.

If $\mathcal{B}_T$ is rightcontinuous ($\mathcal{B}_{t+0} = \mathcal{B}_t$ for every $t \in T$), then $X_{t+0}$ is
$\mathcal{B}_t$-measurable, hence $X_t \le E(X_{t+0} \mid \mathcal{B}_t) = X_{t+0}$ a.s. Thus, if moreover
$EX_T$ is rightcontinuous ($EX_u \to EX_t$ as $u \downarrow t$ for every $t \in T$) then,
for $t < u$,

$$X_t \le X_{t+0} \le E(X_u \mid \mathcal{B}_t) \text{ a.s.}$$

hence $X_t = X_{t+0}$ a.s. and $X'_T$ is equivalent to $X_T$.

The discrete parameter case will be that of intervals $T$ in the extended
line of integers $\{-\infty, \cdots, -1, 0, 1, \cdots, +\infty\}$; leaving out finite
sequences which have no limit point, there are two-sided sequences
$(\cdots, -n, \cdots, -1, 0, 1, \cdots, +n, \cdots)$ with two (left and right) limit points
$-\infty$ and $+\infty$ and one-sided sequences either beginning or ending at
some integer and, without loss of generality, we can take the integer to
be 0. Thus, we have $(0, 1, \cdots)$ with one (left) limit point $+\infty$ or
$(\cdots, -1, 0)$ with one (right) limit point $-\infty$; they may or may not
belong to $T$. If $(X_n, \mathcal{B}_n, n = (\cdots, -1, 0))$ is a submartingale or a
martingale then, setting $Y_n = X_{-n}$ and $\mathcal{C}_n = \mathcal{B}_{-n}$, we have a submartingale or
martingale reversed sequence $(Y_n, \mathcal{C}_n, n = 0, 1, \cdots)$ with $\mathcal{C}_n \downarrow$ as $n \uparrow$.

The reader is invited to restate the preceding propositions in the 
discrete case and compare with 32.2 and 32.3. 

The foregoing discussion following C leads us to consider from now on 
two types of martingales and submartingales and two minimal kinds of 
integrability. 

The two types are discrete and rightcontinuous $(X_T, \mathcal{B}_T)$; we say that
$(X_T, \mathcal{B}_T)$ is rightcontinuous if $T \subset R$ is an interval with endpoints $a < b$
and $X_T$ is sample rightcontinuous.

The two minimal integrability conditions are positive integrability for
submartingales ($X_T^+$ is integrable) and integrability for martingales ($X_T$
is integrable); positive integrability for submartingales implies negative
integrability for supermartingales hence integrability for martingales.

39.2 Martingale times and stopping. In the preceding subsection
it was shown that submartingale or martingale property of $(X_T, \mathcal{B}_T)$ is
preserved by a family of transformations of $X_T$ and, under some
integrability conditions, by sample left or right passages to the limit. In
this subsection it will be shown that this property is preserved, under
similar integrability conditions, by "randomizing" the times, that is, replacing
times $t \in T$ by martingale times $\tau$. Recall, to be used without further
comment, that, by 38.4B, $X_\tau$ are $\mathcal{B}_\tau$-measurable in the discrete and in the
rightcontinuous cases, which are the cases investigated herein.

The approach consists in extending the central lemma and then using
a martingale time $\tau$ as stopping time $(\tau \wedge t, t \in T)$, so called because

$$X_{\tau \wedge t} = X_t I_{[t < \tau]} + X_\tau I_{[t \ge \tau]}$$

yields the r.f. $(X_{\tau \wedge t}, t \in T)$, the r.f. $X_T$ stopped at time $\tau$.

Lemma 39.1c is still central. It has two characteristics: Its martingale
times are simple times and the submartingale is closed on the right.
Taking these into account, we reformulate the lemma for the general
setup $(X_T, \mathcal{B}_T)$ with the "usual" integrability assumptions: Positive
integrability, that is, $X_T^+$ integrable, for submartingales and integrability,
that is, $X_T$ integrable, for martingales.

a. Extended central lemma. Let $\sigma \le \tau$ be times of a positively
integrable submartingale (an integrable martingale) $(X_T, \mathcal{B}_T)$.

I. Let $\sigma \le \tau$ be simple times. Then $(X_\sigma, \mathcal{B}_\sigma)$, $(X_\tau, \mathcal{B}_\tau)$ form a
positively integrable submartingale (an integrable martingale).

II. Let $(X_T, \mathcal{B}_T)$ be closed on the right by $X_b$.

(i) The r.v.'s $X_\tau \vee c$ (the r.v.'s $X_\tau$) where $\tau$ varies over all simple times,
in fact, over all $\tau \le b$ such that $(X_\tau, \mathcal{B}_\tau)$, $(X_b, \mathcal{B}_b)$ form a submartingale
(martingale) are uniformly integrable.

(ii) If $\sigma \le \tau \le b$ are times of rightcontinuous $(X_T, \mathcal{B}_T)$, the foregoing
conclusions hold.


Proof. 1°. Assertion I obtains upon replacing $X_1, \cdots, X_n$ in the
central lemma by $X_{t_1}, \cdots, X_{t_n}$ with $t_1 < \cdots < t_n$ being all the possible
values of $\sigma$ and $\tau$. Assertion II(i) obtains exactly as for $t$ in lieu of $\tau$ in
39.1B; we repeat it for completeness: In the submartingale case, it
suffices to prove uniform integrability when $c = 0$, that is, for the $X_\tau^+$.
This results, upon letting $c \to +\infty$, from

$$c\,P(X_\tau^+ > c) \le \int_{[X_\tau^+ > c]} X_\tau^+ \le \int_{[X_\tau^+ > c]} X_b^+ \le EX_b^+ < \infty$$

hence

$$\sup_\tau P(X_\tau^+ > c) \le EX_b^+/c \to 0,$$

by

$$\sup_\tau \int_{[X_\tau^+ > c]} X_\tau^+ \le \sup_\tau \int_{[X_\tau^+ > c]} X_b^+ \to 0.$$

The martingale case follows from the fact that then $(|X_T|, \mathcal{B}_T)$ is a
submartingale; the simple times case obtains from I by taking $\tau \le \tau' = b$.

2°. It remains to prove II(ii). $T$ is an interval with end points
$a < b \in T$. Without restricting the generality, we can take $a = 0$,
$b = 1$.

(For, we use only the linear order structure of the time set $T \subset R$, and
$\bar{R}$ is order isomorphic to $[-1, +1]$ by $t \to \frac{2}{\pi} \operatorname{Arc\,tan} t$ so that $T$ becomes
its subinterval with endpoints $a' < b'$ which is order isomorphic to the
interval with endpoints 0 and 1 by $t \to (t - a')/(b' - a')$.)

The argument parallels the one for 36.1C. By 38.4A there are sequences
of simple times $\sigma_n \le \tau_n$ with $\sigma_n \downarrow \sigma$, $\tau_n \downarrow \tau$ so that, by rightcontinuity,
$X_{\sigma_n} \to X_\sigma$, $X_{\tau_n} \to X_\tau$ and, by II(i), the $X_{\sigma_n} \vee c$ and the $X_{\tau_n} \vee c$ are
uniformly integrable. Since $\mathcal{B}_{\sigma_n} \supset \mathcal{B}_\sigma$, by I, for $B_\sigma \in \mathcal{B}_\sigma$,

$$\int_{B_\sigma} (X_{\sigma_n} \vee c) \le \int_{B_\sigma} (X_{\tau_n} \vee c)$$

and, upon letting $n \to \infty$ then $c \downarrow -\infty$,

$$\int_{B_\sigma} X_\sigma \le \int_{B_\sigma} X_\tau$$

obtains so that, $X_\sigma$ being $\mathcal{B}_\sigma$-measurable according to 38.4B, $(X_\sigma, \mathcal{B}_\sigma)$,
$(X_\tau, \mathcal{B}_\tau)$ is a submartingale. Upon taking $\tau \le \tau' = b$, II(i) applies.

In the martingale case, the above inequalities with $X_{\sigma_n} \vee c$, $X_{\tau_n} \vee c$
become equalities with $X_{\sigma_n}$, $X_{\tau_n}$ instead and lead to final equality. The
proof is terminated.

A. Martingale times theorem. Let $\sigma \le \tau < b$ be times of a positively
integrable submartingale (an integrable martingale) $(X_T, \mathcal{B}_T)$, either
discrete with $T = (0, 1, \cdots)$ and $b = \infty$ or rightcontinuous with $T = (a, b)$
or $[a, b)$ in $R$.

(i) If

$$\liminf_{t \to b} \int_{[\tau > t]} X_t^+ = 0 \quad \Big(\liminf_{t \to b} \int_{[\tau > t]} |X_t| = 0\Big)$$

then $(X_\sigma, \mathcal{B}_\sigma)$, $(X_\tau, \mathcal{B}_\tau)$ form a positively integrable submartingale (an
integrable martingale).

(ii) For all $\tau \le b$ when $X_T^+$ ($X_T$) is uniformly integrable, also for all
$\tau \le u \in T$, the foregoing conclusion holds and the $X_\tau \vee c$ (the $X_\tau$) are
uniformly integrable.

Note that $(X_T, \mathcal{B}_T)$ is closed on the right by a positively integrable
(an integrable) r.v. if and only if $X_T^+$ ($X_T$) is uniformly integrable.

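Proof. 1°. We prove (i). Let $\sigma_t = \sigma \wedge t$, $\tau_t = \tau \wedge t$, $t \in T$, so that
$\sigma_t \le \tau_t \le t$ are stopping times of the submartingale (the martingale)
$(X_s, \mathcal{B}_s, s \le t)$. Since

$B_\sigma[\sigma \le t] \in \mathcal{B}_{\sigma_t}$ for $B_\sigma \in \mathcal{B}_\sigma$ and $[\sigma \le t][\tau \le t] = [\tau \le t]$,

$$(1) \qquad \int_{B_\sigma[\sigma \le t]} X_\sigma \le \int_{B_\sigma[\sigma \le t]} X_{\tau_t} = \int_{B_\sigma[\tau \le t]} X_\tau + \int_{B_\sigma[\sigma \le t][\tau > t]} X_t$$

obtains from II. The last integral is bounded from above by

$$\int_{[\tau > t]} X_t^+ ,$$

hence, by hypothesis, upon letting $t \to b$ along a suitable sequence, (1)
yields

$$(2) \qquad \int_{B_\sigma} X_\sigma \le \int_{B_\sigma} X_\tau .$$

In the martingale case (1) is an equality and, the last integral being
bounded by $\int_{[\tau > t]} |X_t|$, by hypothesis, upon letting $t \to b$ along a suitable
sequence, equality in (2) obtains.

2°. It remains to prove (ii). If $b \in T$ or all $\tau \le u \in T$, (ii) obtains
from II. If $b \notin T$ and $X_T^+$ ($X_T$) is uniformly integrable then, by
39.1C the sample limit $X_{b-0}$ at $b$ exists and we can close the submartingale
(the martingale) on the right by $X_b = X_{b-0}$ and, thus, have $b \in T$.
The proof is terminated.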

Corollary. Stopping at $\mathcal{B}_T$-time $\tau$ of a positively integrable submartingale
or an integrable martingale $(X_T, \mathcal{B}_T)$ preserves these properties.

Remark. Since the martingale or submartingale property is in terms
of ordered pairs, whenever there is a martingale or submartingale assertion
with $\sigma \le \tau$ then, under the same hypotheses, they can be replaced
by pairs $\tau_{u_1}$, $\tau_{u_2}$ with $u_1 < u_2$ from times $(\tau_u, u \in R)$ with $\tau_u$
nondecreasing with $u$ increasing.

We are now ready for P. Lévy's martingale characterization of Brownian
motion (see Chapter XIII).

$(X_T, \mathcal{B}_T)$ with $T = [0, \infty)$ and $X_0 = 0$ is a Brownian motion if it is sample
continuous (right continuous at 0) and decomposable, that is,
increments $X_{st}$ on disjoint intervals $[s, t)$ are independent, with $\mathcal{L}(X_{st}) =
\mathcal{N}(0, t - s)$. Thus, for $s < t$, it is square integrable,

$$E(X_t \mid \mathcal{B}_s) = E(X_t - X_s \mid \mathcal{B}_s) + X_s = X_s \ \text{a.s.}$$

so that $(X_T, \mathcal{B}_T)$ is a martingale, and

$$E(X_{st}^2 \mid \mathcal{B}_s) = EX_{st}^2 = t - s \ \text{a.s.}$$

so that these conditional variances of the increments degenerate into
variances. We prove the converse:

B. If $(X_T, \mathcal{B}_T)$ with $T = [0, \infty)$ and $X_0 = 0$ is a square integrable sample
continuous martingale with

$$E(X_{st}^2 \mid \mathcal{B}_s) = t - s \ \text{a.s.},$$

then $(X_T, \mathcal{B}_T)$ is a Brownian motion.

Proof. It suffices to prove the assertion for every interval $[a, b] \subset
[0, \infty)$. In fact, it suffices to show that $\mathcal{L}(X_b - X_a) = \mathcal{N}(0, b - a)$.
For, the same argument applies, with obvious modifications, to every
fixed linear combination and then normal increments on disjoint intervals,
being orthogonal, will be independent.

Without loss of generality, we can take $a = 0$, $b = 1$ to simplify the
notation and, to avoid repetitions in what follows, $r$, $s$, $t$ will belong to
$[0, 1]$.

Given $\epsilon > 0$, let $\tau_n(\omega)$ be the first $t$, if any, for which $\max |X_r(\omega) -
X_s(\omega)| = \epsilon$, the maximum being taken over all $0 \le r, s \le t$ with
$|r - s| \le 1/n$; if there is no such $t$ let $\tau_n(\omega) = 1$. By sample continuity
$\tau_n$ is time of the martingale $(X_t, 0 \le t \le 1)$ since, given $0 < t < 1$,
for all rationals $r$, $s$ with $0 \le r, s \le t$, $|r - s| \le 1/n$ and integers $m =
1, 2, \cdots$,

$$[\tau_n > t] = \bigcup_m \bigcap_{r,s} \bigl[\,|X_r - X_s| \le \epsilon - 1/m\,\bigr];$$

also $\tau_n$ depends upon the given $\epsilon$, $0 < \tau_n \le 1$ and $\tau_n \wedge 1 = \tau_n \to 1$ as
$n \to \infty$.

The preceding Corollary applied to the integrable martingale $(X_t, \mathcal{B}_t,
0 \le t \le 1)$ and submartingale $(X_t^2, \mathcal{B}_t, 0 \le t \le 1)$ yields the sample
continuous martingale $(X_{\tau_n \wedge t}, \mathcal{B}_{\tau_n \wedge t}, 0 \le t \le 1)$ and $EX_{\tau_n \wedge t} = 0$,
$EX_{\tau_n \wedge t}^2 \le EX_1^2 = 1$.

Partition $[0, 1]$ into $n$ equal subintervals and denote by $Y_{nk} =
X_{\tau_n \wedge k/n} - X_{\tau_n \wedge (k-1)/n}$ the increments of $X_{\tau_n \wedge t}$ on the $k$-th interval,
$k = 1, \cdots, n$. It follows that $X_{\tau_n} = \sum_k Y_{nk}$ and $|Y_{nk}| \le \epsilon$ with

$$EY_{nk} = E(Y_{nk} \mid Y_{nj}, j < k) = 0 \ \text{a.s.},$$

$$\sigma_{nk}^2 = E(Y_{nk}^2 \mid Y_{nj}, j < k) \le 1/n \ \text{a.s.}$$

Therefore, as $\epsilon \to 0$, if $n = n(\epsilon) \to \infty$, then

$$\sum_k E|Y_{nk}|^3 \le \epsilon \sum_k EY_{nk}^2 \le \epsilon \to 0$$

and, by 31.3A, $\mathcal{L}(X_{\tau_n})$ converges to a normal law. Since, given $\epsilon > 0$,
by sample continuity, $X_{\tau_n} \to X_1$, there is a sequence $n = n(\epsilon) \to \infty$ as
$\epsilon \to 0$ such that $\mathcal{L}(X_{\tau_n}) \to \mathcal{L}(X_1)$ with $EX_1 = 0$, $EX_1^2 = 1$. Therefore
$\mathcal{L}(X_1) = \mathcal{N}(0, 1)$, and the proof is terminated.

§ 40. DECOMPOSABILITY 

Decomposability sprang forth fully armed from the forehead of P.
Lévy (1934). His analysis of "integrals with independent elements" or
"r.f.'s with independent increments" or "additive r.f.'s" or "differential
r.f.'s" or "P. Lévy r.f.'s" or, as we shall call them, "decomposable r.f.'s"
was so complete that since then only improvements of detail have been
added. Before his work there were only pioneering ones by de Finetti
(1929) and Kolmogorov (1932). The only decomposable r.f.'s known
were the Poisson and the Brownian processes, both born from physical
phenomena. (After a pioneering work by Bachelier (1900), the first
rigorous study of the Brownian process was by Wiener (1923) who
discovered its a.s. sample continuity.) Furthermore, the basic concepts
and problems of random analysis appeared in and were born from the
P. Lévy analysis of decomposability. Thus, decomposability is at the
root of the concepts and problems of random analysis.

40.1. Generalities. A r.f. $X_T$ is said to be decomposable if its increments
$X_{st} = X_t - X_s$ for disjoint intervals $[s, t)$ are independent. A
decomposable process on $T$ is the family of decomposable r.f.'s (on some
pr. space) with the same increments. But there is a one-to-one correspondence
between the increment function on $T$ (an additive r.f.
$X_{st}$ of intervals whose endpoints belong to $T$) and the family of r.f.'s
on $T$ defined by $X_t = X_a + X_{at}$ or $X_a - X_{ta}$ according as $t \ge a$ or
$t < a$, where $a \in T$ and $X_a$ are selected arbitrarily. Therefore, we may
consider a decomposable process on $T$ as a decomposable r.f. on $T$ defined
up to an arbitrarily selected value $X_a$ at an arbitrarily selected
point $a \in T$. Thus, a decomposable process will be represented by one
of its r.f.'s $X_T$, either an unspecified one or selected according to convenience,
but always separable.

The law of a decomposable process determined by the joint laws of all
its finite sections is, in fact, determined by the individual laws of its
increments. For, because of decomposability, setting $f_{st}(u) = E \exp\{iuX_{st}\}$,
the joint law of, say, $X_{ac}$ and $X_{bd}$ with $a < b < c < d$ is given
by

$$E \exp\{iu_1 X_{ac} + iu_2 X_{bd}\} = E \exp\{iu_1 X_{ab} + i(u_1 + u_2) X_{bc} + iu_2 X_{cd}\}$$

$$= f_{ab}(u_1)\, f_{bc}(u_1 + u_2)\, f_{cd}(u_2).$$
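The factorization above can be checked numerically in the normal case. The following is a minimal sketch, assuming increments $X_{st}$ with law $\mathcal{N}(0, t - s)$; the function names are illustrative and not part of the text. The joint ch.f. of $(X_{ac}, X_{bd})$ is computed once from the Gaussian covariance (the overlap of $[a, c)$ and $[b, d)$ has length $c - b$) and once from the product of the three increment ch.f.'s.

```python
import cmath

def f_normal(s, t, u):
    """Ch.f. of an increment X_st assumed to have law N(0, t - s)."""
    return cmath.exp(-0.5 * (t - s) * u * u)

def joint_chf_direct(a, b, c, d, u1, u2):
    """E exp{i u1 X_ac + i u2 X_bd} from the Gaussian covariance structure."""
    var_ac = c - a
    var_bd = d - b
    cov = c - b  # length of the overlap of [a, c) and [b, d)
    q = var_ac * u1 ** 2 + 2 * cov * u1 * u2 + var_bd * u2 ** 2
    return cmath.exp(-0.5 * q)

def joint_chf_factored(a, b, c, d, u1, u2):
    """Product f_ab(u1) f_bc(u1 + u2) f_cd(u2) given by decomposability."""
    return f_normal(a, b, u1) * f_normal(b, c, u1 + u2) * f_normal(c, d, u2)
```

Both functions agree up to rounding for any $a < b < c < d$, which is the content of the displayed identity in this special case.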

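Also, if $X_T$ is one of the r.f.'s of a decomposable process then, upon setting
$f_s(u) = E \exp\{iuX_s\}$ and $v_k = u_k + \cdots + u_n$, we have $f_t = f_s f_{st}$ and,
for $t_1 < \cdots < t_n$,

$$E \exp\{iu_1 X_{t_1} + \cdots + iu_n X_{t_n}\} = E \exp\{iv_1 X_{t_1} + iv_2 X_{t_1 t_2} + \cdots + iv_n X_{t_{n-1} t_n}\}$$

$$= f_{t_1}(v_1)\, f_{t_1 t_2}(v_2) \cdots f_{t_{n-1} t_n}(v_n).$$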

Therefore, if the individual laws of all increments are of a specific type,
say, normal or Poisson or infinitely decomposable, we shall say that the
process is of this type: normal decomposable or Poisson decomposable or
infinitely decomposable. In fact, we shall find that, deleting the fixed
discontinuities, the remaining part of any decomposable process is infinitely
decomposable. It is by a deep sample analysis of this remaining part
that P. Lévy discovered the general form of infinitely decomposable laws
with ch.f.'s $e^\psi$, $\psi = (\alpha, \Psi)$. Since we have at our disposal this general
form (§22), we shall proceed from it to the sample interpretation. In
order to do so, it will prove convenient to write $\psi$ in P. Lévy's form
$\psi = (\alpha, \beta^2, L)$, explicitly

$$\psi(u) = i\alpha u - \frac{\beta^2}{2} u^2 + {-\!\!\!\!\!\!\int}_{-\infty}^{+\infty} \Bigl(e^{iux} - 1 - \frac{iux}{1 + x^2}\Bigr)\, dL(x)$$

where the bar which crosses the integral excludes the origin from the
domain of integration and where we can and do take $L(-\infty) = L(+\infty)
= 0$. The correspondence between the two forms of $\psi$ is given by

$$\beta^2 = \Psi(+0) - \Psi(-0), \qquad dL(x) = \frac{1 + x^2}{x^2}\, d\Psi(x) \ \text{for } x \ne 0$$

and

$$\operatorname{Var} \Psi < \infty \iff {-\!\!\!\!\!\!\int}_{-1}^{+1} x^2\, dL(x) < \infty.$$


In order to reach the infinitely decomposable part of the process, we
shall have to delete its degenerate discontinuities by "centering" it, then
delete the "fixed discontinuities" part. This will require recalling (Sections
16 and 17) and adding to convergence properties of series of independent
summands, as follows.

Let $X_1, X_2, \cdots$ be a sequence of independent r.v.'s with ch.f.'s
$f_1, f_2, \cdots$. The series $\sum X_n$ is said to be convergent if it converges a.s.
to a r.v., equivalently, if $\prod_{k=1}^{n} f_k \to f$ ch.f., or if $\prod f_n \ne 0$ on an argument
set of positive Lebesgue measure. The series $\sum X_n$ is said to be essentially
convergent if there exist centering constants $c_n$ such that the
series $\sum (X_n - c_n)$ is convergent, equivalently, if the series $\sum X_n^s$ of
symmetrized summands is convergent, or if $\prod |f_n|^2 > 0$ on an argument
set of positive Lebesgue measure. Clearly, if $c_n$ are centering constants
then $c'_n$ are also centering constants if and only if the series $\sum (c_n - c'_n)$
converges (to a finite limit).

a. If the series $\sum X_n$ is essentially convergent, then the constants $c_n = s_n
- s_{n-1}$ ($s_0 = 0$), determined by the relation $E \operatorname{Arctan} (S_n - s_n) = 0$ where
$S_n = X_1 + \cdots + X_n$, are centering constants.

Note that since Arctan is a bounded increasing and continuous function,
the stated relation determines the (finite) constant $s_n$.
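To see concretely how the relation $E \operatorname{Arctan}(S - s) = 0$ pins down a constant, here is a minimal sketch for a r.v. with finitely many values. The function names and the bisection tolerance are assumptions made for illustration only. Since $s \to E \operatorname{Arctan}(S - s)$ is continuous and strictly decreasing, and changes sign between the smallest and largest values of $S$, bisection finds the unique root.

```python
import math

def mean_arctan(values, probs, s):
    """E Arctan(S - s) for a r.v. S taking values[k] with pr. probs[k]."""
    return sum(p * math.atan(x - s) for x, p in zip(values, probs))

def arctan_center(values, probs, tol=1e-12):
    """Return the constant s with E Arctan(S - s) = 0, found by bisection."""
    lo, hi = min(values), max(values)  # mean_arctan >= 0 at lo, <= 0 at hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_arctan(values, probs, mid) > 0:
            lo = mid  # root lies to the right of mid
        else:
            hi = mid  # root lies at or to the left of mid
    return 0.5 * (lo + hi)
```

For a law symmetric about a point the centering constant is that point; for an asymmetric law it is some value strictly between the extreme values, in general different from both the mean and the median.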

Proof. By hypothesis, there exists a sequence $s'_n$ such that $S_n - s'_n
\to S$ a.s., a r.v. Therefore, for every subsequence $n'$ such that $s_{n'} - s'_{n'} \to s$
finite or not,

$$S_{n'} - s_{n'} = (S_{n'} - s'_{n'}) - (s_{n'} - s'_{n'}) \to S - s \ \text{a.s.}$$

and, by the dominated convergence theorem,

$$E \operatorname{Arctan} (S - s) = \lim E \operatorname{Arctan} (S_{n'} - s_{n'}) = 0.$$

Thus, the constant $s$ is finite and independent of the subsequence $n'$ so
that the whole sequence $s_n - s'_n \to s$, and $S_n - s_n \to S - s$ a.s., a r.v. The
assertions follow.

b. If there exists a r.v. $X$ such that the r.v.'s $Y_n$ defined by $X_1 + \cdots + X_n
+ Y_n = X$ are independent of $X_1, \cdots, X_n$, then the series $\sum X_n$ is essentially
convergent.

Proof. Let $g_n = \prod_{k=1}^{n} f_k$ and let $h_n$ and $f$ be the ch.f.'s of $Y_n$ and $X$.
By hypothesis and because ch.f.'s are continuous and bounded by $f(0) = 1$,
$|g_n|^2 \ge |g_n|^2 |h_n|^2 = |f|^2 > 0$ on some neighborhood of the origin.
Since $|g|^2 = \lim |g_n|^2 = \prod |f_n|^2$ exists, it follows that $|g|^2 > 0$ on
this neighborhood. The assertion follows.

The series $\sum X_n$ is unconditionally convergent if it is convergent under
all reorderings. Constants $c_n$ such that the series $\sum (X_n - c_n)$ is
unconditionally convergent are said to be unconditionally centering, and
constants $c'_n$ are so if and only if the series $\sum (c_n - c'_n)$ is absolutely
convergent, since for numerical series unconditional and absolute convergence
are the same. For example, if the second moments $EX_n^2$ are
finite and $\sum \sigma^2 X_n < \infty$ then the series $\sum (X_n - EX_n)$ converges in
q.m. hence is convergent. In fact, it is unconditionally convergent since
so is the series $\sum \sigma^2 X_n$ (and the $EX_n$ are unconditionally centering
constants), while the series $\sum X_n$ itself is unconditionally convergent if and
only if $\sum |EX_n| < \infty$.
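As a hedged illustration of the example just given, take the (assumed, not from the text) independent summands $X_n = \pm 1/n$ with pr. 1/2 each. Then $EX_n = 0$, $\sigma^2 X_n = 1/n^2$ is summable, and the ch.f.'s are $f_n(u) = \cos(u/n)$, so the partial products $\prod f_n(u)$ should stabilize at a nonzero value near the origin. The sketch below computes both quantities.

```python
import math

def partial_variance_sum(n_terms):
    """Sum of the variances 1/n^2 of X_n = +-1/n, n = 1, ..., n_terms."""
    return sum(1.0 / n ** 2 for n in range(1, n_terms + 1))

def partial_chf_product(u, n_terms):
    """Partial product of the ch.f.'s f_n(u) = cos(u/n), n = 1, ..., n_terms."""
    prod = 1.0
    for n in range(1, n_terms + 1):
        prod *= math.cos(u / n)
    return prod
```

The variance sums stay below $\pi^2/6$, and at $u = 1$ the partial products remain strictly between 0 and 1 and change very little once many terms are included, in line with the convergence of the series.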

c. If the series $\sum X_n$ is essentially convergent, then the constants $c_n =
\mu X_n + E(X_n - \mu X_n)^c$ (where $\mu X_n$ is a median of $X_n$ and $c$ is some
truncating constant) are unconditionally centering.

Proof. According to the two-series criterion, essential convergence of
the series $\sum X_n$ is equivalent to convergence of the two series $\sum P[\,|X_n
- \mu X_n| \ge c\,]$ and $\sum \sigma^2 (X_n - \mu X_n)^c$, and then the series $\sum (X_n - c_n)$
is convergent. Since the two series are convergent under all reorderings
so is the series $\sum (X_n - c_n)$, and the assertion is proved.

d. The series $\sum X_n$ is unconditionally convergent if and only if the series
$\sum |f_n - 1|$ converges on some argument set $U$ of positive Lebesgue measure.

Proof. If $\sum |f_n - 1| < \infty$ on some $U$, hence converges on $U$ under
all reorderings, then so does $\prod f_n$ and the "if" assertion follows.

Conversely, if the series $\sum X_n$ is unconditionally convergent, then so is
the series $\sum (X_n - c_n)$ and $\sum |c_n| < \infty$. Let $\tilde f_n(u) = e^{-ic_n u} f_n(u)$ so
that, for $|u| \le b$,

$$|f_n(u) - 1| = |e^{ic_n u} \tilde f_n(u) - 1| \le |\tilde f_n(u) - 1| + b\,|c_n|.$$

According to 22.2B$_1$ and 22.2B$_2$ where we take $\tau = c$ and center at a
median $\mu$, which does not change $|f|$, for $u \le b$ sufficiently small so
that $\log |f_n(u)|$ exists and is finite, there exists a constant $a$ such that

$$|\tilde f_n(u) - 1| \le a \int_0^b \bigl|\log |f_n(v)|\bigr|\, dv.$$

But the series $\sum (X_n - c_n)$ being convergent, we can take $b$ sufficiently
small so that $\prod |f_n(u)| > \frac{1}{2}$ for $|u| \le b$, and then

$$\sum |f_n(u) - 1| \le ab \log 2 + b \sum |c_n| < \infty.$$

The "only if" assertion is proved.
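The criterion $\sum |f_n - 1| < \infty$ separates unconditional from merely essential convergence, and two toy series make the contrast visible. This is a sketch under assumptions not in the text: the first series has $X_n = \pm 1/n$ with pr. 1/2 each (ch.f. $\cos(u/n)$), the second has degenerate $X_n = 1/n$ (ch.f. $e^{iu/n}$, so $|f_n| = 1$ and the series is essentially convergent with centering constants $1/n$, although $\sum 1/n$ itself diverges).

```python
import cmath
import math

def chf_symmetric(n, u):
    """Ch.f. of X_n = +1/n or -1/n with pr. 1/2 each (illustrative choice)."""
    return complex(math.cos(u / n), 0.0)

def chf_constant(n, u):
    """Ch.f. of the degenerate r.v. X_n = 1/n (illustrative choice)."""
    return cmath.exp(1j * u / n)

def sum_abs_gap(chf, u, n_terms):
    """Partial sum of |f_n(u) - 1| for n = 1, ..., n_terms."""
    return sum(abs(chf(n, u) - 1) for n in range(1, n_terms + 1))
```

For the symmetric series the partial sums stay below $u^2 \pi^2 / 12$ because $1 - \cos x \le x^2/2$; for the degenerate series they grow like the harmonic series, so the criterion fails, as it must.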

Let $\sum T_j$ be an arbitrary partition of the set of integers $(1, 2, \cdots)$.
The series $\sum_j \bigl(\sum_{k \in T_j} X_k\bigr)$ is a "partitioned summation" of the series $\sum X_n$.

e. If the series $\sum X_n$ is unconditionally convergent then it converges a.s.
to a same limit r.v. under all reorderings and partitioned summations, and
so does any of its subseries.

Proof. If $\sum |f_n(u) - 1| < \infty$ then $\sum |f_{n'}(u) - 1| < \infty$ for any
subsequence $n'$ hence, by d, the subseries $\sum X_{n'}$ is unconditionally
convergent.

The difference between the series $\sum X_n$ and the reordered series
$\sum X'_n$ is defined on the independent r.v.'s $X_n$ and is independent of
$X_1, \cdots, X_n$ for any $n$. Therefore, by the zero-one law, this difference
degenerates into a constant $c$. Since $\sum |f_n - 1|$ is convergent
so that $\prod f_n = \prod f'_n$, it follows that $c = 0$. Similarly for partitioned
summations because of the elementary inequality

$$\Bigl|\prod_j \prod_{k \in T_j} f_k - 1\Bigr| \le \sum_j \sum_{k \in T_j} |f_k - 1|$$

valid for arbitrary complex numbers $f_k(u)$ bounded
by one (proceed by induction).
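The elementary inequality is easy to spot-check. The sketch below, with arbitrarily chosen sample values that are assumptions for illustration, compares $|\prod f_k - 1|$ with $\sum |f_k - 1|$ for complex numbers of modulus at most one. The induction step behind it is $|zw - 1| \le |z|\,|w - 1| + |z - 1| \le |w - 1| + |z - 1|$ when $|z| \le 1$.

```python
import cmath

def product_gap(zs):
    """|prod z_k - 1| for a finite list of complex numbers."""
    prod = 1 + 0j
    for z in zs:
        prod *= z
    return abs(prod - 1)

def sum_gap(zs):
    """sum |z_k - 1| for a finite list of complex numbers."""
    return sum(abs(z - 1) for z in zs)

# Sample values of modulus at most one (hypothetical ch.f. values at a fixed u).
sample = [0.9 * cmath.exp(0.3j), cmath.exp(-1.1j), 0.5 + 0j, -0.2 + 0.4j]
```

Calling `product_gap(sample)` and `sum_gap(sample)` shows the left side well below the right side, since the product of numbers in the unit disk stays in the unit disk while the individual gaps add up.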

40.2. Three parts decomposition. We separate decomposable processes
into three parts: a numerical function, an interpolated series of
fixed discontinuities and an a.s. continuous part. More precisely

A. Three parts decomposition theorem. Every decomposable process
is the sum

$$X_T = x_T + X_T^d + X_T^c$$

of three independent parts (not necessarily all present):

(i) A centering function $x_T$.

(ii) A fixed discontinuities part $X_T^d$, centered decomposable, with almost
all sample functions continuous except at the countable fixed discontinuities
set.

(iii) An a.s. continuous part $X_T^c$, centered decomposable, with almost
all sample functions continuous except for countable sets of jumps.

We proceed in steps and, first, we have to give a meaning to the term
"centered."
Let $a$, $b$ be the extreme points of the closure $\overline{T}$ of $T$ and let $T_{ab}$ be
this closure with $a$ and/or $b$ excluded unless they belong to $T$. We intend
to show that we can delete from $X_T$ its degenerate discontinuities,
if any, by including them within a centering function $x'_T$ so as to leave
a centered decomposable process $X'_T$: a process such that for every
$t \in T_{ab}$ which is a left (right) limit point of $T$ the a.s. limits $X'_{t-0}$ ($X'_{t+0}$)
exist and there are no degenerate discontinuities, that is, if $X'_{t-0,t} =
X'_t - X'_{t-0}$ ($X'_{t,t+0} = X'_{t+0} - X'_t$) degenerates then it degenerates at
zero. Note that

The fixed discontinuities set of a centered decomposable process is countable
and $X'_{t-0,t+0} = X'_{t+0} - X'_{t-0}$ degenerates at zero if and only if
$X'_{t-0,t}$ and $X'_{t,t+0}$ degenerate at zero.

The first assertion results from the continuity extension theorem. The
second assertion results from the fact that $X'_{t-0,t+0} = X'_{t-0,t} + X'_{t,t+0}$
is the sum of two independent r.v.'s, since $f_{t-0,t+0} = f_{t-0,t}\, f_{t,t+0} = 1$ if
and only if $f_{t-0,t}(u) = e^{-iuc}$, $f_{t,t+0}(u) = e^{+iuc}$ where $c = 0$ by definition
of a centered process.

a. Centering lemma. Let $X_T$ be decomposable. There exist centering
functions $x'_T$ such that $X'_T = X_T - x'_T$ is centered decomposable. The
fixed discontinuities set of $X'_T$ is independent of the choice of the centering
function and the set $D_s$ of its points to the right of any $s$ forms the
discontinuity set of the function $d_{st} = \int_0^1 |f_{st}(u)|\, du$, $t > s$.

Proof. Let $x'_T$ be determined by the relation $E \operatorname{Arctan} (X_t - x'_t) =
0$. If $T \ni s_n \uparrow t \in T_{ab}$ then, from $t \le s \in T$ and the equality

$$\sum_{k=1}^{n} (X_{s_{k+1}} - X_{s_k}) + (X_s - X_{s_{n+1}}) = X_s - X_{s_1}$$

it follows, by 40.1a and b, that the sequence

$$X'_{s_{n+1}} = X'_{s_1} + \sum_{k=1}^{n} (X'_{s_{k+1}} - X'_{s_k})$$

converges a.s. to a r.v. If $s'_n \uparrow t - 0$ and $s''_n \uparrow t - 0$ then both sequences can be
reordered and combined into a sequence $s_n \uparrow t$ so that, because of separability,
the one-sided a.s. limit r.v. $X'_{t-0}$ exists; similarly for $X'_{t+0}$.
Since $E \operatorname{Arctan} X'_t = 0$ hence, by the dominated convergence theorem,
$E \operatorname{Arctan} X'_{t \pm 0} = 0$, it follows that all degenerate discontinuities and,
in fact, all degenerate increments degenerate at zero. Thus the decomposable
process $X'_T$ is centered. Since any change of centering function
changes any given discontinuity by a constant hence cannot reduce
nondegenerate discontinuities to zero, it follows that the fixed discontinuities
set of $X'_T$ does not vary with the centering function. The first
assertion is proved.

The relation $f_{s,t+h} = f_{st}\, f_{t,t+h}$ implies that $|f_{st}|$ hence $d_{st}$ is nonincreasing
in $t$. But, setting $f'_{st}(u) = E \exp\{iuX'_{st}\}$, we have $|f'_{t-0,t+0}(u)|
= 1$ for $t \notin D_s$ and for all $u$, and $|f'_{t-0,t+0}(u)| < 1$ for $t \in D_s$ and a
$u$-set of positive Lebesgue measure in $[0, 1]$. Therefore, from $|f'_{st}| =
|f_{st}|$ it follows that $d_{st}$ is discontinuous at $t$ if and only if $t \in D_s$. The
proof is terminated.
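The role of the function $d_{st}$ can be illustrated on a toy decomposable process. The following sketch rests on assumptions that are not in the text: the process is a Brownian motion on $[0, 1]$ plus an independent jump $\pm 1$ (pr. 1/2 each) added at a hypothetical fixed time `T0`, so that $|f_{0t}(u)| = e^{-tu^2/2}$ for $t < T_0$ and $e^{-tu^2/2}\,|\cos u|$ for $t \ge T_0$. The integral over $[0, 1]$ is approximated by the midpoint rule.

```python
import math

T0 = 0.5  # hypothetical fixed discontinuity point of the toy process

def abs_chf(t, u):
    """|f_0t(u)| for Brownian motion plus a +-1 jump occurring at time T0."""
    base = math.exp(-0.5 * t * u * u)
    return base * abs(math.cos(u)) if t >= T0 else base

def d(t, n=2000):
    """Midpoint-rule approximation of d_0t = integral over [0, 1] of |f_0t(u)| du."""
    h = 1.0 / n
    return sum(abs_chf(t, (k + 0.5) * h) for k in range(n)) * h
```

Evaluating `d` just before and at `T0` shows a visible drop, while at any other point nearby values of `d` differ only slightly: the function is nonincreasing and its only discontinuity sits at the fixed discontinuity of the process.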

So far, $T$ was an arbitrary set in $R$. In fact, we can and whenever
convenient we shall take as domain an interval in the interval $I_{ab}$
whose endpoints $a$, $b$ are those of $T$ except that they are to be excluded
from $I_{ab}$ unless they belong to $T$. For

b. Decomposability extension lemma. A decomposable process $X_T$
can be extended on $I_{ab}$ with preservation of decomposability, of centerings and
of type provided the type is invariant under additions and passages to the
limit.

Proof. Let $X'_T$ be the process after centering. Set $X'_t = X'_{t+0}$ for
right limit points $t \in I_{ab} - T$. The remaining points of $I_{ab} - T$ form
intervals $(c, d)$ or $[c, d)$ where the $d$ are right limit points belonging or
not to $T$. Set $X'_t = X'_d$ on such intervals. The process so extended has
the asserted properties.

c. Centered sample lemma. Almost all sample functions of a centered
decomposable process $X'_T$ are bounded on every set $[c, d]T$, $c, d \in T$, and are
continuous except for countable sets of jumps outside the fixed discontinuities
set.

Proof. We can take $T$ to be $[c, d]$ without restricting the generality.

1° Let $\mu_t$ be a median of $X'_d - X'_t$. If $s_n \uparrow t$ then every limit value
of the sequence $\mu_{s_n}$ is a median of $X'_d - X'_{t-0}$ and similarly for $s_n \downarrow t$.
Therefore, the function $\mu_t$ is bounded by some constant $\alpha$. Since the
decomposable process defined by $Y_t = X'_t - X'_c + \mu_t$ is such that
$Y_d - Y_t$ has 0 for a median it follows by 17.1c that, for $c = t_1 < \cdots < t_n = d$,

$$P[\sup_k |Y_{t_k}| \ge \beta] \le 2P[\,|Y_d| \ge \beta\,]$$

hence

$$P[\sup_k |X'_{t_k} - X'_c| \ge \alpha + \beta] \le 2P[\,|X'_d - X'_c| \ge \beta\,]$$

and, by separability,

$$P[\sup_t |X'_t - X'_c| \ge \alpha + \beta] \le 2P[\,|X'_d - X'_c| \ge \beta\,],$$

which implies the boundedness assertion.

2° The second assertion will be proved if we show that one-sided
limits $X'_{t \pm 0}$ exist for almost all sample functions. Let $u$ vary over
$[-\gamma, +\gamma]$ where $\gamma$ is sufficiently small so that $f'_{cd}(u) \ne 0$. Since
$f'_{ct}(u)\, f'_{td}(u) = f'_{cd}(u)$, the same is true of all $f'_{ct}(u)$. Then the $Z_t(u) =
\exp\{iuX'_{ct}\}/f'_{ct}(u)$ are bounded (complex-valued) r.v.'s. Since a.s.

$$E(Z_t(u) \mid Z_r(u), r \le s) = \exp\{iuX'_{cs}\}\, f'_{st}(u)/f'_{cs}(u)\, f'_{st}(u) = Z_s(u),$$

the $Z_t(u)$ form a (complex-valued) martingale for every $u \in [-\gamma, +\gamma]$.
But, the decomposable process $X'_T$ being centered, the functions $f'_{ct}(u)$
of $t$ have one-sided limits. Thus, to prove that almost all sample functions
have one-sided limits, it suffices to show that the same is
true of every martingale $Z_t(u)$. If these martingales are separable, this
follows from 36.1C. However, the martingales may not be separable.
But, $X'_T$ being separable, it suffices to prove the assertion for its restriction
$X'_S$ on a separating set $S$ and hence for the restrictions $Z_S(u)$
of the martingales on this countable set, and then what precedes applies.
The proof is terminated.

For the next proposition, it will be convenient to think of $T$ as an interval.
Let $\{t_j\}$ be the fixed discontinuities in $[a, b]T$ of a centered decomposable
process $X'_T$ and let $U'_j = X'_{t_j - 0, t_j}$, $V'_j = X'_{t_j, t_j + 0}$; at most
one of these one-sided jumps may degenerate and then it will be at zero.
Let $U_j = U'_j - c_j$ where $c_j = \mu U'_j + E(U'_j - \mu U'_j)^c$ and let $V_j$ be similarly
defined in terms of $V'_j$. Since the one-sided jumps are all independent
and, setting $\sum_{j=1}^{n} U'_j + Y_n = X'_b - X'_a$, lemma 40.1b applies, and the
same is true for the $V'_j$, it follows, by 40.1c and e, that the series $\sum_{t_j \in I} U_j$
and $\sum_{t_j \in I} V_j$ are unconditionally convergent for every interval $I \subset T$ and
converge to the same a.s. limit under all reorderings and partitioned summations.
Thus, selecting arbitrarily $t_0 \in T$ and setting

$$X_t^d = \sum_{t_0 < t_j \le t} U_j + \sum_{t_0 \le t_j < t} V_j \quad \text{or} \quad -\sum_{t < t_j \le t_0} U_j - \sum_{t \le t_j < t_0} V_j$$

according as $t \ge t_0$ or $t < t_0$, the $X_t^d$ are a.s. determined and we can and
do select them so that $X_T^d$ be separable.

d. Decomposition lemma. A centered decomposable process $X'_T$ is the
sum

$$X'_T = x''_T + X_T^d + X_T^c$$

of three independent parts: a centering function $x''_T$, a centered decomposable
process $X_T^d$ corresponding to the fixed discontinuities of $X'_T$, and a centered
decomposable process $X_T^c$ with no fixed discontinuities.

Proof. If $s_n \uparrow t \notin \{t_j\}$, then a.s. $X_{s_{n+1}}^d = X_{s_1}^d + \sum_{k=1}^{n} (X_{s_{k+1}}^d - X_{s_k}^d)$,
hence $X_{s_n}^d \to X_t^d$ a.s. Similarly, if $s_n \uparrow t_j$, then $X_{s_n}^d \to X_{t_j}^d - U_j$ a.s.
In the same manner, if $s_n \downarrow t \notin \{t_j\}$, then $X_{s_n}^d \to X_t^d$ a.s.; and if $s_n \downarrow t_j$,
then $X_{s_n}^d \to X_{t_j}^d + V_j$ a.s. Thus $U_j = X_{t_j - 0, t_j}^d$, $V_j = X_{t_j, t_j + 0}^d$,
$X_{t_j - 0, t_j + 0}^d = U_j + V_j$ nondegenerate, and $X_T^d$ is centered with fixed
discontinuities set $\{t_j\}$. If $x''_T$ is a centering function of $X'_T - X_T^d$ it
follows that the process $X_T^c = X'_T - X_T^d - x''_T$ is centered decomposable
and has no fixed discontinuity points. The proof is terminated.

e. Discontinuous part sample lemma. Almost all sample functions
of the centered decomposable process $X_T^d$ are continuous outside the fixed
discontinuities set $\{t_j\}$.

Proof. We already know that for a centered decomposable process
almost all sample functions are continuous except for countable sets of
jumps outside the fixed discontinuities set $\{t_j\}$. Yet, this does not imply
that they are continuous except at the $t_j$'s.

To prove the assertion, we take $T = [a, b]$, denote by $D$ the $\omega$-set of
sample functions with discontinuities outside $\{t_j\}$, and note that
$\sup_t |X_t^d(\omega)| \ge \epsilon$ for those sample functions $X_T^d(\omega)$ which have a
discontinuity outside the $t_j$ at which their oscillation is at least $2\epsilon$. Furthermore,
if a finite number $n$ of $t_j$'s with the corresponding $U_j$ and $V_j$ are
deleted from $T$ and from $X_T^d$, the $\omega$-set $D_\epsilon$ of these sample functions
remains the same. Thus, to prove the assertion, we may assume this deletion
made (including $a$ and $V_a$ if necessary).

Since the series $\sum U_j$ and $\sum V_j$ are unconditionally convergent, it
follows from the three series criterion that

$$\sum_{j > n} \sigma_j^2 = \sum_{j > n} \sigma^2 (U_j^c + V_j^c) \to 0$$

and, for $n \ge n(\epsilon)$ sufficiently large,

$$\sum_{j > n} \{P[U_j \ne U_j^c] + P[V_j \ne V_j^c]\} < \epsilon, \qquad \sum_{j > n} \{|EU_j^c| + |EV_j^c|\} < \epsilon.$$

Let $Y_T$ be a process defined by means of the $U_j^c$, $V_j^c$ as $X_T^d$ was defined
by means of the $U_j$, $V_j$, with same deletion. According to Kolmogorov's
inequality, if $a = s_0 < \cdots < s_m = b$ then

$$P[\sup_{j \le m} |Y_{s_j} - EY_{s_j}| \ge \epsilon] \le \sigma^2 Y_{s_m}/\epsilon^2 \le \sum_{j > n} \sigma_j^2/\epsilon^2$$

hence

$$P[\sup_{j \le m} |X_{s_j}^d| \ge 2\epsilon] \le \epsilon + \sum_{j > n} \sigma_j^2/\epsilon^2$$

and, by separability,

$$P[\sup_t |X_t^d| \ge 2\epsilon] \le \epsilon + \sum_{j > n} \sigma_j^2/\epsilon^2.$$

Since the bracketed condition defines an event which contains the set of
sample functions with $\omega \in D_\epsilon$ and we can let $n \to \infty$, it follows that
$D_\epsilon \subset B_\epsilon$ with $PB_\epsilon \le \epsilon$. If $\epsilon_n = \epsilon/2^n$, then $D_{\epsilon_n} \uparrow D = \bigcup D_{\epsilon_n} \subset \bigcup B_{\epsilon_n}
= B$ with $PB \le \sum \epsilon_n = \epsilon$. Thus the set $D$ is contained in events of pr.
at most equal to $\epsilon > 0$ arbitrarily small and hence is contained in a null
event. The proof is terminated.

Upon gathering the foregoing lemmas and setting $x_T = x'_T + x''_T$, the
three-parts decomposition follows.

40.3. Infinite decomposability; normal and Poisson cases. We proceed
now to the analysis of the a.s. continuous decomposable parts. To
simplify the writing, unless otherwise stated, the domain $T$ will be an
interval $[a, b]$ and the processes $X_T$ will be represented by their r.f.'s
determined by the condition $X_a = 0$, so that $X_{at} = X_t$ and $f_{at} = f_t$. We
shall also use the following notation: partitions of $[a, t]$ are given by
$a = s_{n0} < \cdots < s_{nn} = t$ with $\max_k (s_{nk} - s_{n,k-1}) \to 0$ and the increments
$X_{nk} = X_{s_{nk}} - X_{s_{n,k-1}}$ have for d.f.'s $F_{nk}$ and for ch.f.'s $f_{nk}$.

a. Continuity equivalence lemma. For a decomposable process $X_T$,
$T = [a, b]$, continuity in law, in pr., a.s., are equivalent and are uniform,
and imply that the process is centered.

Proof. Since continuity a.s. implies continuity in pr. which implies
continuity in law, the equivalence assertion will be proved by showing
that continuity in law, that is, continuity of the ch.f.'s $f_t$ in $t$, implies a.s.
continuity. Since the fixed discontinuities set of the centered process
$X'_T = X_T - x'_T$ is the discontinuities set of the function $d_t = \int_0^1 |f_t(u)|\, du$
and $f_t$ is continuous in $t$, it is empty. Therefore $X'_T$ is continuous a.s.
hence in law, that is, the function $e^{-iux'_t} f_t(u)$ as well as the function $f_t(u)$
are continuous in $t$. Thus the centering function is continuous, hence
$X_T = x'_T + X'_T$ has no fixed discontinuities and is centered. The uniformity
assertion follows from 35.3a, and the proof is terminated.

Because of the above lemma, we drop "a.s." in "a.s. continuous decomposable
process" and write c.d. process, for short.

We say that $\Psi_t(x)$ is continuous in $t$ if $\Psi_{t'}(x) \to \Psi_t(x)$ as $t' \to t$ for
every continuity point $x$ of the function $\Psi_t$, $t \in T$.

A. Continuous decomposability theorem. Let $X_T$, $T = [a, b]$, be a
separable c.d. process. Then

(i) $X_t$ is infinitely decomposable and $\log f_t = \psi_t = (\alpha_t, \beta_t^2, L_t) =
(\alpha_t, \Psi_t)$ with $\alpha_t$ continuous and $\Psi_t(x)$ continuous in $t$ and nondecreasing
in $t$ and $x$.

(ii) Almost all sample functions of $X_T$ are bounded and are continuous
except for countable sets of jumps, and $|L_t(x)| = E\nu_t(x)$ is the expectation
of the number $\nu_t(x)$ of jumps of these sample functions in $[a, t)$ of height less
than $x < 0$ or at least equal to $x > 0$.

Proof. 1° Since $X_T$ is uniformly continuous in law, $f_{ss'} \to 1$ uniformly
in $s$, $s'$ as $s - s' \to 0$ so that, taking partitions of $[a, t]$, $X_t =
\sum_{k=1}^{n} X_{nk}$ is sum of uan independent r.v.'s. Therefore, by the central
limit theorem, $X_t$ is infinitely decomposable and $\log f_t = \psi_t = (\alpha_t, \beta_t^2, L_t)
= (\alpha_t, \Psi_t)$. Since $\log f_t$ is continuous in $t$, it follows from the convergence
theorem 22.1D that $\alpha_t$, $\Psi_t(x)$ are continuous in $t$. Since
$\log f_{st} = \psi_{st} = (\alpha_{st}, \Psi_{st})$ with $\Psi_{st}(x)$ nonnegative and nondecreasing in
$x$, the function $\Psi_t(x)$ is nondecreasing in $t$ and $x$.

2° Let $x < 0$ be a continuity point of $L_t(x)$, take partitions of $[a, t]$
and set $\nu_t^{(n)}(x) = \sum_k I_{[X_{nk} < x]}$. Consider the (almost all) sample functions
which are bounded and continuous except for countable sets of
jumps, according to the centered sample lemma 40.2c.
Then $\nu_t^{(n)}(x) \to \nu_t(x)$ while, by the central limit theorem,

$$E\nu_t^{(n)}(x) = \sum_{k=1}^{n} P[X_{nk} < x] = \sum_{k=1}^{n} F_{nk}(x) \to L_t(x)$$

and

$$\sigma^2(\nu_t^{(n)}(x)) = \sum_{k=1}^{n} F_{nk}(x) - \sum_{k=1}^{n} F_{nk}^2(x) \le E\nu_t^{(n)}(x),$$

so that the sequence of second moments

$$E(\nu_t^{(n)}(x))^2 \le E\nu_t^{(n)}(x) + (E\nu_t^{(n)}(x))^2 \to L_t(x) + L_t^2(x)$$

is bounded. Therefore $E\nu_t(x) = \lim E\nu_t^{(n)}(x) = L_t(x)$, and the nondecreasing
and continuous from the left functions $E\nu_t(x)$ and $L_t(x)$
coincide for all continuity points $x < 0$ of $L_t(x)$ hence for all $x < 0$.
Similarly for $x > 0$, and the proof is terminated.

Brownian and Poisson cases. Leaving out the trivial degenerate c.d.
case which corresponds to vanishing $\Psi_t(x)$, the foregoing theorem yields
properties of two extreme cases corresponding to $\Psi_t(x)$ with one fixed
point of increase only: The normal c.d. process corresponds to the point
of increase $x = 0$, that is, to vanishing $L_t(x)$, so that

$$\psi_t(u) = i\alpha_t u - \frac{\beta_t^2}{2} u^2$$

with $\alpha_t$ continuous and $\beta_t^2$ continuous and nondecreasing. The Poisson
c.d. process corresponds to the point of increase $x = c \ne 0$ and, setting
$\lambda_t = L_t(c + 0) - L_t(c - 0)$, $\gamma_t = \alpha_t - \lambda_t \dfrac{c}{1 + c^2}$, to

$$\psi_t(u) = i\gamma_t u + \lambda_t (e^{iuc} - 1)$$

with $\gamma_t$ continuous and $\lambda_t$ continuous and nondecreasing. The
"reduced" forms obtained by centering and changing the instantaneous
time-scale are called the Brownian process, $\psi_t(u) = -\dfrac{\sigma^2 t}{2} u^2$,
$\sigma^2 > 0$, and the Poisson process, $\psi_t(u) = \lambda t (e^{iu} - 1)$, $\lambda > 0$. We denote
by $\mathcal{L}(\alpha, \beta)$ a law with two values $\alpha$ and $\beta$ only.
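The Poisson form of $\psi_t$ can be checked against a direct computation. In this minimal sketch (function names and the truncation level `n_max` are assumptions for illustration), the r.v. is taken to be $\gamma + cN$ with $N$ a Poisson r.v. of parameter $\lambda$; its ch.f. is summed term by term from the Poisson probabilities and compared with the closed form $\exp\{i\gamma u + \lambda(e^{iuc} - 1)\}$.

```python
import cmath
import math

def chf_series(u, lam, c, gamma, n_max=200):
    """E exp{iu(gamma + cN)}, N Poisson(lam), summed over n = 0, ..., n_max - 1."""
    total = 0j
    term = math.exp(-lam)  # P[N = 0]
    for n in range(n_max):
        total += term * cmath.exp(1j * u * c * n)
        term *= lam / (n + 1)  # P[N = n + 1] from P[N = n]
    return cmath.exp(1j * u * gamma) * total

def chf_closed(u, lam, c, gamma):
    """exp{psi(u)} with psi(u) = i gamma u + lam (exp(iuc) - 1)."""
    return cmath.exp(1j * gamma * u + lam * (cmath.exp(1j * u * c) - 1))
```

For moderate $\lambda$ the truncated series and the closed form agree to rounding error, which is the Poisson case of the representation above with jump height $c$.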


B. Normal and Poisson c.d. criteria. Let $X_T$, $T = [a, b]$ be a c.d.
process and take partitions of $[a, b]$.

($\mathfrak{N}$) $X_T$ is a normal c.d. process if and only if almost all its sample
functions are continuous, or $X_{ab}$ is a normal r.v., or $\mathcal{L}(\max_k |X_{nk}|) \to \mathcal{L}(0)$.

($\mathcal{P}$) $X_T$ is a Poisson c.d. process if and only if almost all its sample
functions are step-functions with jumps of constant height, or $X_{ab}$ is a Poisson
type r.v., or $\mathcal{L}(\min_k X_{nk}) \to \mathcal{L}(0)$ and $\mathcal{L}(\max_k X_{nk}) \to \mathcal{L}(0, c)$ where $c > 0$
or $\mathcal{L}(\min_k X_{nk}) \to \mathcal{L}(c, 0)$ and $\mathcal{L}(\max_k X_{nk}) \to \mathcal{L}(0)$ where $c < 0$, with
$X_{ab}$ lattice-valued.

Proof. In ($\mathfrak{N}$) and in ($\mathcal{P}$) the sample functions assertions follow from
A(ii), the normality and Poisson type assertions follow from A(i) or from
the composition and decomposition theorem 19.2A, and the extrema assertions
follow from the extrema criterion 22.4C.


Brownian processes are born from and are used in physics to describe 
motions of particles such as molecules of a gas. However, by their very 
nature, they are first approximations only, for, while almost all sample 
functions are continuous, they are extremely irregular in nature. They 
are investigated in detail in the next chapter. 

Poisson processes are also born from and serve to describe various 
physical phenomena such as radioactive disintegrations. In fact, we 
can and do consider only the step sample functions (whose jumps cor¬ 
respond to disintegrations) taken to be rightcontinuous. Then 

b. Poisson sample jumps lemma. Let $X_T$, $T = [0, \infty)$, be a Poisson
process.

(i) The c. law of jumps in $I = [s, s + t)$ given that $n$ occurred is that
of $n$ independent r.v.'s uniformly distributed in $I$.

(ii) The times $\tau_n$ ($\tau_0 = 0$) between the $(n - 1)$th and $n$th jumps are independent
r.v.'s with $P[\tau_n > t] = e^{-\lambda t}$.

Proof. The numbers of jumps $Y$ and $Z$ in $I' = [s', s' + t') \subset I$ and
in $I - I'$ are independent Poisson r.v.'s with parameters $\lambda t'$ and $\lambda (t - t')$
and $Y + Z$ is also Poisson with parameter $\lambda t$. Thus, by elementary computations,

$$P(Y = m \mid Y + Z = n) = P[Y = m, Z = n - m]/P[Y + Z = n]$$

$$= C_n^m\, (t'/t)^m (1 - t'/t)^{n - m},$$

and (i) is proved. Therefore, taking $t > t_1 + t_2$,


P(ri > /i, T2 > h 



and P[t\ > /i, r2 > ^2} = + \tp + (X/p) 2 / 2 ! 

similar computations yield independence of any number of $\tau$'s, and (ii) is
proved. Or, note that 41.1A case 2° or 38.4A corollary applies to $X_T$,
set $\sigma_n = \tau_1 + \cdots + \tau_{n-1}$, and

$$P(\tau_n > t_n \mid \tau_{n-1}, \tau_{n-2}, \ldots) = P(X_{\sigma_n + t_n} - X_{\sigma_n} = 0 \mid \sigma_n) = P[X_{t_n} - X_0 = 0] = e^{-\lambda t_n}.$$
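The elementary computation behind (i) can be checked numerically. The sketch below is only an illustration: the function names and the numerical values of $\lambda$, $t'$, $t$ and $n$ are our own assumptions. It compares the conditional law of the number of jumps in a subinterval, given the total number of jumps, with the binomial law with parameter $t'/t$.

```python
import math

def poisson_pmf(k, mean):
    """P[N = k] for a Poisson r.v. N with the given mean."""
    return math.exp(-mean) * mean**k / math.factorial(k)

def conditional_jump_law(m, n, lam, t_sub, t):
    """P(Y = m | Y + Z = n): Y counts jumps in a subinterval of length t_sub,
    Z counts jumps in the remainder of an interval of length t."""
    joint = poisson_pmf(m, lam * t_sub) * poisson_pmf(n - m, lam * (t - t_sub))
    return joint / poisson_pmf(n, lam * t)

def binomial_pmf(m, n, p):
    """Binomial law: C(n, m) p^m (1 - p)^(n - m)."""
    return math.comb(n, m) * p**m * (1 - p) ** (n - m)

# Illustrative values only (not taken from the text).
lam, t_sub, t, n = 2.5, 0.7, 3.0, 6
for m in range(n + 1):
    lhs = conditional_jump_law(m, n, lam, t_sub, t)
    rhs = binomial_pmf(m, n, t_sub / t)
    print(m, round(lhs, 6), round(rhs, 6))
```

The parameter $\lambda$ cancels in the ratio, which is why the conditional law depends only on $t'/t$.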


Stationarity and law derivatives. The ch.f.'s of the increments $X_{t,t+h}$
of Brownian and Poisson processes are $f_{t,t+h} = f_h = e^{h\psi}$ with
$\psi(u) = -\frac{\sigma^2}{2} u^2$ and $\psi(u) = \lambda(e^{iu} - 1)$. Thus, these processes are stationary
and (equivalently) have stationary law derivatives, according to what
follows.


The general concept of stationarity along an index set $T$ is that of
invariance under translations along $T$ or, to use an intuitive terminology
when $T = [0, \infty)$, of invariance under translations in time. Thus, a
process $X_T$ on $T = [0, \infty)$ is stationary if its law of evolution is stationary
under translations in time. For a centered decomposable $X_T$ it suffices
that the individual laws of its increments $X_{t,t+h}$ be stationary: $f_{t,t+h} = f_h$
for all $h > 0$. For, it follows at once that the joint laws of its increments
hence its law of evolution are stationary. But then, by decomposability,
$f_{h+k} = f_h f_k$ for all $h, k > 0$ so that $f_h = (f_{h/n})^n$ for $n$ as large as we
wish. Therefore $f_h = e^{\psi_h}$ is infinitely decomposable and $f_h \to 1$ as
$h \to 0$. Thus the relation becomes $\psi_{h+k} = \psi_h + \psi_k$ with $\psi_h \to 0$ as
$h \to 0$. It follows that a stationary centered decomposable process is a c.d.
process with $f_{t,t+h} = f_h = e^{h\psi}$, $\psi = (\alpha, \Psi)$.
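As a numerical illustration of the relations $f_{h+k} = f_h f_k$ and $f_h = (f_{h/n})^n$, the following sketch evaluates $f_h = e^{h\psi}$ for the Brownian and Poisson exponents. The function names and the test values of $u$, $h$, $k$, $n$ are assumptions made for the example.

```python
import cmath

def psi_brownian(u, sigma2=1.0):
    """Exponent of the Brownian increments: psi(u) = -sigma^2 u^2 / 2."""
    return -sigma2 * u * u / 2

def psi_poisson(u, lam=1.0):
    """Exponent of the Poisson increments: psi(u) = lam (e^{iu} - 1)."""
    return lam * (cmath.exp(1j * u) - 1)

def chf(psi, h, u):
    """Ch.f. of an increment over a time-interval of length h: exp(h psi(u))."""
    return cmath.exp(h * psi(u))

h, k, n = 0.4, 1.1, 8  # illustrative values only
for psi in (psi_brownian, psi_poisson):
    for u in (-2.0, 0.3, 1.7):
        semigroup_gap = abs(chf(psi, h + k, u) - chf(psi, h, u) * chf(psi, k, u))
        root_gap = abs(chf(psi, h, u) - chf(psi, h / n, u) ** n)
        print(psi.__name__, u, semigroup_gap, root_gap)
```

Both gaps are zero up to rounding, which is the content of stationarity together with decomposability for these two laws.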

We say that a ch.f. $f_t$ represents the (right) (left) law derivative at $t$ of a
centered decomposable $X_T$ if $(f_{t-h,t+k})^{1/(h+k)} \to f_t$ as $h + k \to 0$ with
$h, k \ge 0$, $h + k > 0$ ($h = 0$) ($k = 0$). According to the central convergence
theorem, when the limit ch.f. exists, then it is necessarily an i.d.
ch.f. $f_t = e^{\psi_t}$, $\psi_t = (\alpha_t, \Psi_t)$, and the process is (right) (left) law continuous;
in particular, if $X_T$ is stationary then $f_t \leftarrow (e^{h\psi})^{1/h} = e^{\psi}$ so that the law
derivative exists and is stationary. Thus, in searching for conditions of
existence of law derivatives, we may assume that $X_T$ is a c.d. process
with $f_{st} = e^{\psi_{st}}$, $\psi_{st} = (\alpha_{st}, \Psi_{st})$. Then it follows from 22.1D that the
law derivative exists if and only if $\frac{1}{h+k}\psi_{t-h,t+k} \to \psi_t$ or, equivalently,
$\frac{1}{h+k}\Psi_{t-h,t+k} \to \Psi_t$ and $\frac{1}{h+k}\alpha_{t-h,t+k} \to \alpha_t$. In particular, if the
law derivative exists and is stationary, hence for every $t$ and every $u$ the
derivative $\psi(u)$ of $\psi_t(u)$ exists and is independent of $t$, it follows that the
process is stationary.


We collect what precedes into

C. Stationary decomposability criterion. A centered decomposable
process $X_T$, $T = [0, \infty)$, is stationary if and only if the ch.f.'s of its
increments are of the form $f_{t,t+h} = f_h = e^{h\psi}$, $\psi = (\alpha, \Psi)$ or, equivalently, its law
derivative exists and is stationary.

Integral decomposition. A look at the P. Levy form of $\psi_t = (\alpha_t, \beta_t^2, L_t)$
makes one think of a c.d. process as a “sum” of a normal c.d. process and
of the Poisson c.d. processes corresponding to all points of increase of
the P. Levy functions. In fact, P. Levy showed that in a sense it was
true and then obtained the general form for infinitely decomposable laws.
Ito made this analysis precise. We shall give the P. Levy-Ito result but
proceeding from the general form of the infinitely decomposable laws (to
be specific, from the continuous decomposability theorem to which it
led) to the reconstruction of c.d. processes by means of their normal and
Poisson “components,” as follows.

Let $X_T$ be a c.d. process, use the notation introduced in this subsection,
and take partitions of $[a, t)$.

c. Number of jumps lemma. The numbers $\nu_{st}[x, y)$ of jumps in $[s, t)$
of height in $[x, y)$, $xy > 0$, are Poisson r.v.'s with parameter $L_{st}[x, y)$,
independent for disjoint time-intervals $[s, t)$ and independent for disjoint
height-intervals $[x, y)$.

Thus, the processes $(\nu_t(x), t \in T)$, $x \ne 0$, are Poisson c.d. processes
and the processes $(\nu_t(x), x \in (-\infty, 0) \cup (0, +\infty))$, $t \in T$, are Poisson
decomposable.

Proof. Independence of the $\nu_{st}[x, y)$ for disjoint time-intervals $[s, t)$
results from independence of the corresponding increments $X_{st}$ on which
they are defined.

For $x, y \in C(L_t)$, the continuity set of $L_t$, with $x < y < 0$,

$$\varphi_t^{(n)}(u) = E \exp\{iu\nu_t^{(n)}[x, y)\} = \prod_k E \exp\{iu I_{[x \le X_{nk} < y]}\} = \prod_k \{1 + (e^{iu} - 1) F_{nk}[x, y)\}$$

where $\max_k F_{nk}[x, y) \to 0$ and $\sum_k F_{nk}[x, y) \to L_t[x, y)$. Upon taking $n$
sufficiently large and setting $\varphi_t(u) = E \exp\{iu\nu_t[x, y)\}$, it follows from
$\nu_t^{(n)}[x, y) \xrightarrow{\text{a.s.}} \nu_t[x, y)$ that

$$\log \varphi_t(u) \leftarrow \log \varphi_t^{(n)}(u) = (1 + o(1))(e^{iu} - 1) \sum_k F_{nk}[x, y) \to (e^{iu} - 1) L_t[x, y).$$

Therefore, upon letting $x' \uparrow x$ along $C(L_t)$ when $x \notin C(L_t)$, and similarly
when $y \notin C(L_t)$, the result is valid for all $x, y < 0$, by continuity from
the left of $L_t$; similarly for all $x, y > 0$. The Poisson r.v.'s assertion follows.
In fact, it is an immediate consequence of the first criterion in
B($\mathfrak{P}$). We gave a direct proof because the one below generalizes it:

If $[x_j, y_j)$, $x_j y_j > 0$, are $m$ disjoint height-intervals and all the $x_j$,
$y_j \in C(L_t)$ then, as above, for $n$ sufficiently large,

$$\log E \exp\Big\{\sum_j iu_j \nu_t^{(n)}[x_j, y_j)\Big\} = \sum_k \log\Big\{1 + \sum_j (e^{iu_j} - 1) F_{nk}[x_j, y_j)\Big\}$$

$$= (1 + o(1)) \sum_j (e^{iu_j} - 1) \sum_k F_{nk}[x_j, y_j) \to \sum_j (e^{iu_j} - 1) L_t[x_j, y_j),$$

and the restriction to continuity endpoints is removed as above. The
Poisson r.v.'s and independence assertions follow, and the proof is
terminated.
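The passage from the product $\prod_k \{1 + (e^{iu} - 1)F_{nk}\}$ to the Poisson exponent $(e^{iu} - 1)L$ can be watched numerically. The sketch below assumes, purely for illustration, the simplest triangular array $F_{nk} = L/n$, $k = 1, \ldots, n$, so that $\max_k F_{nk} \to 0$ and $\sum_k F_{nk} = L$; the names and values are ours.

```python
import cmath

def log_chf_triangular(u, L, n):
    """log of prod_k {1 + (e^{iu} - 1) F_nk}, with the assumed choice F_nk = L / n."""
    return n * cmath.log(1 + (cmath.exp(1j * u) - 1) * L / n)

def log_chf_poisson(u, L):
    """log ch.f. of a Poisson r.v. with parameter L."""
    return (cmath.exp(1j * u) - 1) * L

u, L = 1.3, 2.0  # illustrative values only
errors = [abs(log_chf_triangular(u, L, n) - log_chf_poisson(u, L))
          for n in (10, 100, 1000, 10000)]
for n, err in zip((10, 100, 1000, 10000), errors):
    print(n, err)
```

The error decreases roughly like $1/n$, in agreement with the factor $1 + o(1)$ in the proof.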

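d. Integration lemma. The a.s. integrals

$$I_t = \int_{-\infty}^{+\infty} \Big\{x \, d\nu_t(x) - \frac{x}{1 + x^2} \, dL_t(x)\Big\}$$

exist and are i.d. r.v.'s with

$$\log E \exp\{iuI_t\} = \int_{-\infty}^{+\infty} \Big(e^{iux} - 1 - \frac{iux}{1 + x^2}\Big) \, dL_t(x).$$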


Proof. We drop the subscript $t$ and consider only the almost all
sample functions for which $\nu[a, b)$ has the properties stated in A(ii), so
that we drop “a.s.”

Let $\alpha < \beta < 0$. If the jumps of the step-function $(\nu(\omega, [\alpha, x)), x \in [\alpha, \beta))$
occur at $x_1(\omega), \ldots, x_{n(\omega)}(\omega)$ then

$$I_\alpha^\beta(\omega) = \int_\alpha^\beta x \, d\nu(\omega, x) = \sum_{k=1}^{n(\omega)} x_k(\omega)$$

defines a finite function $I_\alpha^\beta$ of $\omega$. Set $x_{nk} = \alpha + k(\beta - \alpha)/2^n$,
$\nu_{nk} = \nu[x_{nk}, x_{n,k+1})$, and $L_{nk} = E\nu_{nk} = L[x_{nk}, x_{n,k+1})$. Since
$S_n = \sum_k x_{nk}\nu_{nk} \uparrow I_\alpha^\beta$, it follows that the integral $I_\alpha^\beta$ is measurable,

$$\log E \exp\{iuS_n\} = \sum_k (e^{iux_{nk}} - 1) L_{nk} \to \int_\alpha^\beta (e^{iux} - 1) \, dL(x) = \log E \exp\{iuI_\alpha^\beta\},$$

and this integral is an i.d. r.v. We can define the integral $I_{-\infty}^\beta$ by
$I_\alpha^\beta \to I_{-\infty}^\beta$ as $\alpha \downarrow -\infty$ or directly by means of partitions of $[\alpha_n, \beta]$ where
$\alpha_n \downarrow -\infty$. In either case, it follows that $I_{-\infty}^\beta$ is an i.d. r.v. with

$$\log E \exp\{iuI_{-\infty}^\beta\} = \int_{-\infty}^\beta (e^{iux} - 1) \, dL(x).$$

Similarly for $\alpha, \beta > 0$ then $\beta \uparrow +\infty$.

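We cannot do the same with, say, $I_{-1}^\beta$ as $\beta \uparrow 0$, since

$$\int_{-1}^{-0} (e^{iux} - 1) \, dL(x)$$

may not exist. However, as $\beta \uparrow 0$, by the convergence in q.m. criterion,
there is a limit in q.m.:

$$(1) \qquad \int_{-1}^{\beta} x \, d\{\nu(x) - L(x)\} \xrightarrow{\text{q.m.}} \int_{-1}^{-0} x \, d\{\nu(x) - L(x)\},$$

since as $\beta, \beta' \uparrow 0$

$$E\Big(\int_{-1}^{\beta} x \, d\{\nu(x) - L(x)\} \int_{-1}^{\beta'} x \, d\{\nu(x) - L(x)\}\Big) = \int_{-1}^{\beta}\!\int_{-1}^{\beta'} xx' \, dd' E\{\nu(x) - L(x)\}\{\nu(x') - L(x')\} \to \int_{-1}^{-0} x^2 \, dL(x) < \infty.$$

But the left side of (1) is a martingale in $\beta \uparrow 0$ or may be considered as a
sequence of consecutive sums of independent summands $\int_{\beta_n}^{\beta_{n+1}}$
with $\beta_n \uparrow 0$. Either way, the limit in q.m. is also an a.s. limit and

$$\log E \exp\Big\{iu \int_{-1}^{\beta} x \, d\{\nu(x) - L(x)\}\Big\} = \int_{-1}^{\beta} (e^{iux} - 1 - iux) \, dL(x)$$

$$\to \int_{-1}^{-0} (e^{iux} - 1 - iux) \, dL(x) = \log E \exp\Big\{iu \int_{-1}^{-0} x \, d\{\nu(x) - L(x)\}\Big\};$$

similarly for $\int_{+0}^{1} x \, d\{\nu(x) - L(x)\}$. Finally, upon adding to $\int_{-\infty}^{-1} x \, d\nu(x)$
the finite constant $-\int_{-\infty}^{-1} \frac{x}{1 + x^2} \, dL(x)$ and to $\int_{-1}^{-0} x \, d\{\nu(x) - L(x)\}$
the finite constant $\int_{-1}^{-0} \frac{x^3}{1 + x^2} \, dL(x)$ so as to obtain under the
integral signs the same expression $x \, d\nu(x) - \frac{x}{1 + x^2} \, dL(x)$, similarly for
integrals over $(0, 1)$ and $[1, \infty)$, and adding the four integrals, the lemma
follows.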
We are now ready for the “integral decomposition” of c.d. processes.
We go back to our partitions of $[a, t)$ and recall that the $X_{nk}$ are uan
independent (in $k$) r.v.'s with ch.f.'s $f_{nk} = e^{\psi_{nk}}$, $\psi_{nk} = (\alpha_{nk}, \Psi_{nk})$, and
$\max_k |\psi_{nk}(u)| \to 0$. Let $\alpha_t = 0$. Since $\sum_k \psi_{nk} = \psi_t$ where $\psi_t = (0, \Psi_t)$,
it follows that there exists a finite $c(u)$ such that


E \U(u) - i| =E| ，刪 -i| = (i + o(i)) El Mu) I 

k k k 


^ (1 + o(l)) 



Thus, 


iuy \ 1 + y 2 


-J- y^/ 


y 


2 


d^nh{y) 


^ (1 + o(\)){c(u) Var 


$$\sum_k |f_{nk}(u) - 1|^2 \le \max_k |f_{nk}(u) - 1| \sum_k |f_{nk}(u) - 1| \to 0$$

and therefore

$$\sum_k (f_{nk}(u) - 1) = (1 + o(1)) \sum_k \psi_{nk}(u) \to \psi_t(u).$$

D. Integral decomposition theorem. Every c.d. process $X_T$, $T = [a, b]$,
is sum of two independent processes

$$X_t = \eta_t + \int_{-\infty}^{+\infty} \Big\{x \, d\nu_t(x) - \frac{x}{1 + x^2} \, dL_t(x)\Big\}$$

where $\eta_T$ is a normal c.d. process, and $\nu_t$ has the properties stated in the
number of jumps lemma; and conversely.

Proof. The converse is immediate and, by A and c, the direct assertion
requires only the proof of the independence and normality parts. Since
the integral process values $I_t$ are defined on the process $(\nu_t(x), x \ne 0)$,
it suffices to prove that $\eta_t$ and $\nu_t(x)$ are independent for $x \in C(L_t)$. The
proof is the same for $x < 0$ and $x > 0$. Let, say, $x < 0$. We can and do
assume that $\alpha_t = 0$, without restricting the generality.

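If

$$I_{nt}(\epsilon) = \int_{|x| \ge \epsilon} \Big\{x \, d\nu_t^{(n)}(x) - \frac{x}{1 + x^2} \, dL_t(x)\Big\}$$

then it is easily verified that $I_{nt}(\epsilon) \xrightarrow{\text{a.s.}} I_t$ as $n \to \infty$ then $\epsilon \to 0$ so that,
setting $\eta_t = X_t - I_t$, it follows that

$$Y_{nt}(\epsilon) = X_t - I_{nt}(\epsilon) \xrightarrow{\text{a.s.}} \eta_t$$

and

$$\varphi_{nt}(u, v) = E \exp\{iuY_{nt}(\epsilon) + iv\nu_t^{(n)}(x)\} \to \varphi(u, v) = E \exp\{iu\eta_t + iv\nu_t(x)\}.$$

We drop $t$, write $\nu_n$ in lieu of $\nu^{(n)}$, and can and do take $x < -\epsilon$, $\pm\epsilon \in C(L_t)$.
Since

$$\int_{-\infty}^{-\epsilon} x \, dI_{[X_{nk} < x]} = X_{nk} I_{[X_{nk} < -\epsilon]}, \qquad \int_{\epsilon + 0}^{+\infty} x \, dI_{[X_{nk} < x]} = X_{nk} I_{[X_{nk} > \epsilon]},$$

we have

$$Y_n(\epsilon) = \sum_k X_{nk} I_{[|X_{nk}| < \epsilon]} + \int_{|y| \ge \epsilon} \frac{y}{1 + y^2} \, dL(y)$$

and

$$\varphi_n(u, v) = \exp\Big\{iu \int_{|y| \ge \epsilon} \frac{y}{1 + y^2} \, dL(y)\Big\} \prod_k E \exp\{iuX_{nk} I_{[|X_{nk}| < \epsilon]} + ivI_{[X_{nk} < x]}\}.$$

Elementary computations yield for the factors in the product the expression

$$1 + (f_{nk}(u) - 1) - \int_{|y| \ge \epsilon} (e^{iuy} - 1) \, dF_{nk}(y) + (e^{iv} - 1) F_{nk}(x).$$

Since

$$\max_k |f_{nk}(u) - 1| \to 0, \qquad \max_k \int_{|y| \ge \epsilon} dF_{nk}(y) \to 0, \qquad \max_k F_{nk}(x) \to 0,$$

it follows that for $n$ sufficiently large

$$\log \varphi_n(u, v) = iu \int_{|y| \ge \epsilon} \frac{y}{1 + y^2} \, dL(y) + (1 + o(1)) \Big\{\sum_k (f_{nk}(u) - 1) - \sum_k \int_{|y| \ge \epsilon} (e^{iuy} - 1) \, dF_{nk}(y) + (e^{iv} - 1) \sum_k F_{nk}(x)\Big\}.$$

Therefore, as $n \to \infty$ then $\epsilon \to 0$,

$$\log \varphi(u, v) \leftarrow \log \varphi_n(u, v) \to -\frac{\beta^2}{2} u^2 + (e^{iv} - 1) L(x).$$

The independence and normality assertions follow, and the proof is
terminated.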
COMPLEMENTS AND DETAILS

All r.f.'s are selected to be separable on some unspecified but fixed pr. space.

1. Let $X_T = (X_t, t \ge 0)$ be separated for closed sets by $S$. Set
$\underline{X}_t = \liminf_{t' \to t} X_{t'}$, $\underline{X}_{t+0} = \liminf_{t' \to t+0} X_{t'}$.

a) If $\underline{X}_t = X_t$ outside a null event $N_t$, then the r.f. $(\underline{X}_t, t \ge 0)$ is equivalent
to $X_T$, is separated for closed sets by $S$, and is a Borel r.f. with almost all sample
functions lower semi-continuous. What if the limits are defined in terms of
deleted neighborhoods?

b) If J 卜。 = Xt^o = Xt outside Nt y then the r.f. ( 石 +❶， / ^ 0) has the above 
asserted properties except that lower semi-continuity is replaced by right lower 
semi-continuity. 

c) What if liminf is replaced by limsup? 

2. If there exist constants $c$, $p$, $q \ge 0$, $r > 0$ such that

$$E|X_{t_1} - X_{t_2}|^p |X_{t_2} - X_{t_3}|^q \le c|t_1 - t_3|^{1+r}, \qquad 0 \le t_1 < t_2 < t_3 \le 1,$$

then almost all sample functions of $(X_t, 0 \le t \le 1)$ are continuous except
perhaps for simple discontinuities.

3. Let Xt } T = [0, oo) be a Borelian stationary r.f. 

a) If is integrable then, for every « d i ?， 

if e -inB x a ds ^ integrable, 
t Jo 一 

= 0 (or u 9 ^ v and ^ degenerates at zero at every u except for a countable 
set of values of u. 

If X 0 2 is integrable, then the convergence is also in quadratic mean and 
E^u^v == 0 for u 9 ^ v. What if above e ^ XU3 is replaced by (e ^ iu3 — e^ iu ^ h ^)/iu^ 

4. $\mathcal{B}_T$ and $\mathcal{B}_{T+}$ times. Let $\mathcal{B}_T = (\mathcal{B}_t, t \ge 0)$, $\mathcal{B}_{T+} = (\mathcal{B}_{t+0}, t \ge 0)$.

a) If $\sigma$, $\tau$, $\tau_n$ are $\mathcal{B}_T$-times then so are $\sigma + \tau$ and $\sup \tau_n$; in particular, limits of
nondecreasing sequences of $\mathcal{B}_T$-times are $\mathcal{B}_T$-times.

b) If $\tau$ is a $\mathcal{B}_T$-time, then $[\tau < t] \in \mathcal{B}_t$ for every $t$, but in general, the converse
is not true unless $\mathcal{B}_T$ is rightcontinuous. Example: Let $(X_t, \mathcal{B}_t, t \ge 0)$
be a Brownian motion. Then $\tau = \min\{t: X_t = 1\}$ is $\mathcal{B}_T$-time, $\tau' =
\inf\{t: X_t > 0\}$ is $\mathcal{B}_{T+}$ (but not $\mathcal{B}_T$)-time.

c) $\mathcal{B}_{T+}$ is rightcontinuous; to simplify the writing assume that $\mathcal{B}_T$ is rightcontinuous.
If $\tau_n$ are $\mathcal{B}_T$-times then so are $\liminf \tau_n$ and $\limsup \tau_n$. If, moreover,
$\tau_n \downarrow \tau$ then $\mathcal{B}_\tau = \bigcap_n \mathcal{B}_{\tau_n}$.

d) Let $(X_t, \mathcal{B}_t, t \ge 0)$ with $X_T$ sample rightcontinuous ($T = [0, \infty)$). For
Borel $S \subset R$ set

$$\tau_S = \inf\{t: X_t \in S\} \text{ or } \infty,$$

according as this set is nonempty or empty.

Let $U \subset R$ be open. Then $\tau_{U^c}$ is $\mathcal{B}_T$-time and, when $\mathcal{B}_T$ is rightcontinuous, so
is $\tau_U$.

e) Let the foregoing $X_T$ be rightcontinuous with sample left limits and let $\mathcal{B}_T$
be rightcontinuous. Let every $\mathcal{B}_t$ contain all subsets of all null events. Then
$\tau_S$ is $\mathcal{B}_T$-time for every Borel $S$.

5. Let $(X_T, \mathcal{B}_T)$, $T = [0, 1]$, be a martingale.

a) Let $X_t^2$ be integrable and set $F_t = EX_t^2$. Then $X_T$ has orthogonal increments,
for $s < t$, $E(X_t - X_s)^2 = F_t - F_s$, and the fixed discontinuity points of
$X_T$ are the discontinuity points of $F_t$. $(X_T^2, \mathcal{B}_T)$ is a submartingale. What
about $(X_T^2 - F_T, \mathcal{B}_T)$? Is it a martingale if and only if, for $s < t$,

$$E((X_t - X_s)^2 \mid \mathcal{B}_s) = E(X_t - X_s)^2 \text{ a.s.}$$

If $X_T$ has no fixed discontinuity points, what does the change of parameter to $t'$
given by $t = F(t')$ do?

b) Let $X_T$ be sample continuous. By partitioning suitably $T$, find $\mathcal{L}(X_t)$
under various hypotheses which permit the use of results and/or methods of §28.
When is $\mathcal{L}(X_t)$ normal? Can it be Poisson?

6、 Second order integral decomposition. Given a second order r.f. Xt with 


EXtX f t* — JJgMgr(s f ) dd f y(s y s f ) where y(s y j r ) is of bounded variation on 
every finite square, there exists a second order r.f. ^(s) with = 


AhA f h*y(s y s f ) such that Xt = Jgt d%. 


Set (X t9 X t f ) = (gt } gr) y let c^s be constants, L(X) be the family of all linear 
combinations 53 c k^t ky L{g) be the family of all linear combinations X) ^kgt ky and 
let L^{X) be the closure in q.m. of L{X) and L y {g) be the 7 -closure of L{g). 

a) Extend the correspondence Xt / C T y to L(X) and L y (g) preserving 

linearity and then extend it to L^{X) and L y (g) preserving inner products. The 
correspondence between L^{X) and L y {g) preserves linearity and inner products. 

b) If all indicators I[ a ,b) €1 L y (g ), denote by ^[a y B) the corresponding elements 


of [ 2 (^ 0 . Then Xt = JSt ^ 


c) If some indicators I[a t b) L y {g) introduce a set T f whose elements t are 
the missing indicators, extend g on T + by taking gt y t G T\ such that 
{g iy gt*) exist and are finite for 、〆 d T Cl T ’， extend Xt on T U by taking 
X ty / G so that (X h Xr) = (gtygr) y /, /' G 7" U T and Xr independent of 
Xt } and apply If). 

d) Give as particular cases the decompositions theorems in Section 34. 

7 . Let E\ d^{s) | 2 = dF{s) and let g t (s) be measurable with respect to the 
dt -product measure with | gt(s) | 2 dF{s) < 00 for almost all /• 


a) The r.f. Xt with Xt = JV 心）炎 ( J ) can be selected so as to be measurable: 
Start with gt(s) = 如 ( J ). 


b) If jdF(s) (Jk 心 )I A) < % I 2 JF(s)) < 00 , then the 

iterated integrals Jd^{s) ( 心 ） (where the bracketed integral is selected so 
as to be measurable) ， JV/ 炎 CO) exist, are a.s. equal, and denoted by 
$\iint g_t(s) \, dt \, d\xi(s)$.

c) Let g\t) be the derivative of g{t\ t C [a y b] y and g t (s) = g f {t) or 0 according 
as / ^ j or / > Then, for F continuous at a and b 


fjjM dt 炎 w = (g(^) — g(a))^[a y ^) — J* ^[a y t)g\t) dt. 


What if F(s) is not continuous at a and b} 
S. Let ^(s) be a martingale with 


^(| | 2 > ^r y r S s) = E\ /) I 2 = F[s y t) a.s. 


a) The family of r.f/s G(s) with ^E\ G(s) \ 2 JF(s) < oo contains those which 

are measurable with respect to the ds ^/P-measure with every r.v. G(s) being 
®(^ r> r ^ J)-measurable. 

If ^(s) = 7j(s) — s where 7](s) is a Poisson r.f. with X = 1, then E d^{s) = 0, 
E{d^{s)Y = dt, and 



= U^ y b)Y - b). 


If ^(s) is a Brownian r.f. with a 2 = 1, then 



i (钉这 ，勿 ) 2 — § (々一 设). 


b) R.f/s Xt = (X ty l a) y X t — j G(s) d^{s) are square-integrable martin- 

gales whose almost all sample functions have one-sided limits at all points and 
the fixed discontinuities are discontin ities of F(s). If almost all sample func¬ 
tions of 芒 (j) are continuous so are those of Xt* 

c) If Xt } T = [a y b\ is a square-integrable martingale whose almost all sample 

functions are continuous and£(| X 3ti | 2 | X ry rS J) = j£( J* | G(s) | 2 ds \ X ry r S - 
a.s., then there exists a Brownian r.f. ^(s) , a S s S such that X a% t = I G(s) d^{s) 

J n. 


a.s.; however，the pr. space may have to be enlarged to a product of the 
given pr. space with an appropriate one unless G(w, s) vanishes almost nowhere 

on (o) y /)-space. What about a converse? 

9. Zero-one law. Let $X_T$, $T = [0, \infty)$, be decomposable. If $\xi$ is a tail r.v.
on the family of increments $X_t - X_s$, $s, t \ge 0$, then $\xi$ is degenerate.

10. Strong law of large numbers. Let $X_T$, $T = [0, \infty)$, be decomposable with
stationary increments and $E(X_t - X_0) = 0$. Then $E(X_1 - X_0 \mid X_s - X_0,
s \ge t) = E(X_1 - X_0 \mid X_t - X_0) = (X_t - X_0)/t$ a.s. and $X_t/t \to 0$ a.s. Extend as
in Section 29.4, II.

11. Let $X^{(n)} = (X_t^{(n)}, t \ge 0)$, $n = 1, 2, \ldots$, be independent Poisson r.f.'s with
parameter $\lambda > 0$. The pr. that $X^{(1)}, \ldots, X^{(n)}$ all be constant in an interval of length
$h$ is $e^{-\lambda hn}$. Let $c_n > 0$. The series $X_t = \sum c_n X_t^{(n)}$ is a.s. convergent.

Are the following statements true? $X = (X_t, t \ge 0)$ is a.s. continuous, strictly
increasing, and its moving discontinuity sets form an everywhere dense set in
$[0, \infty)$.

12. Let $X_{st}$ be the number of occurrences in $[s, t)$ of purely random events: the
$X_{st}$ are a.s. finite for finite time-intervals and independent for disjoint ones with
$EX_{st} = \lambda(t - s)$, the pr. of occurrence at any specified time and the pr. of two
simultaneous occurrences are zero. The $X_{st}$ form a Poisson process, and
conversely.

13. Poisson distribution of particles. Let $X_i$, $i = \ldots, -1, 0, +1, \ldots$ be r.v.'s
and let $\nu_I$ be the number of $X_i \in I$; $I$, with or without affixes, are intervals of
length $|I|$, and $X_i$ and $\nu_I$ have same affixes if any. The $X_i$ represent (positions
of) particles and the $\nu_I$ are their numbers in $I$. The “distribution of particles”
is the family of joint distributions of $\nu_{I_1}, \ldots, \nu_{I_n}$ for all $I_1, \ldots, I_n$, $n = 1, 2, \ldots$.
The distribution is “Poissonian” if the $\nu_I$ are independent for disjoint intervals
$I$ and Poissonian with parameter $\lambda|I|$.

Let the particles $X_i(t)$, $t \ge 0$, move “independently”: for every fixed $t$, the
increments $X_i[0, t)$ are mutually independent, identically distributed, and
independent of all $X_i(0)$. Let $P_t[a, b] = P[X_i[0, t) \in [a, b]]$.

a) The Poisson distribution of particles is invariant in time (that is, if their
distribution is Poissonian at time 0 then it is Poissonian with same $\lambda$ at any
time $t$).

b) If, as $t \to \infty$, for every $h > 0$,

$$\sum_{n = -\infty}^{+\infty} |P_t[nh, (n + 1)h] - P_t[(n - 1)h, nh]| \to 0$$

and $E|(\nu_I/|I|) - \lambda| \to 0$ as $|I| \to \infty$, uniformly in all intervals $I$ of length
$|I|$, for some constant $\lambda$, then the distribution of particles converges to the Poisson
distribution (the distribution “converges” to the distribution of particles $[X_i]$
if the $\mathcal{L}(\nu_{I_1}^{(t)}, \ldots, \nu_{I_n}^{(t)})$ converge to $\mathcal{L}(\nu_{I_1}, \ldots, \nu_{I_n})$ for all $I_1, \ldots, I_n$, $n = 1, 2, \ldots$).
What if $\lambda$ is a r.v.? What if there is no convergence as $|I| \to \infty$?

14. Let $(X_t, t \ge 0)$ with $X_0 = 0$ be a Poisson r.f. with parameter $\lambda > 0$.

a) Set $Y_t = X_t - \lambda t$, $Z_t = Y_t^2 - \lambda t$, $U_t = \exp\{-cX_t + \lambda t(1 - e^{-c})\}$, $c \in R$.
Then $(Y_t, \mathcal{B}_t, t \ge 0)$, $(Z_t, \mathcal{B}_t, t \ge 0)$, $(U_t, \mathcal{B}_t, t \ge 0)$ are martingales.

b) Given a positive integer $m$, let $\tau_m$ be the first time $X_t$ reaches $m$. Then

$$\lambda E\tau_m = m, \qquad \sigma^2(\tau_m) = m/\lambda^2,$$

and, setting $\alpha = -\lambda(1 - e^{-c})$,

$$Ee^{-\alpha\tau_m} = e^{mc} = (\lambda/(\lambda + \alpha))^m;$$

$\mathcal{L}(\tau_m)$ is a $\Gamma$ law with parameters $m$ and $\lambda$.
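The relations in 14b) can be checked numerically from the transform $(\lambda/(\lambda + \alpha))^m$ alone. In the sketch below the function names, the finite-difference step and the numerical values of $m$, $\lambda$, $c$ are our own assumptions; the mean and variance are recovered from the first two derivatives of the transform at $\alpha = 0$.

```python
import math

def laplace_gamma(alpha, m, lam):
    """Transform E exp(-alpha * tau_m) of a Gamma law with parameters m and lam."""
    return (lam / (lam + alpha)) ** m

def mean_from_transform(m, lam, h=1e-4):
    """E tau_m as minus the derivative of the transform at 0 (central difference)."""
    return -(laplace_gamma(h, m, lam) - laplace_gamma(-h, m, lam)) / (2 * h)

def variance_from_transform(m, lam, h=1e-4):
    """Variance as second derivative at 0 minus the squared mean."""
    second = (laplace_gamma(h, m, lam) - 2.0 + laplace_gamma(-h, m, lam)) / (h * h)
    return second - mean_from_transform(m, lam, h) ** 2

m, lam, c = 3, 2.0, -0.4            # illustrative values; c < 0 makes alpha > 0
alpha = -lam * (1 - math.exp(-c))   # the substitution used in the text
print(math.exp(m * c), laplace_gamma(alpha, m, lam))
print(mean_from_transform(m, lam), m / lam)
print(variance_from_transform(m, lam), m / lam**2)
```

With this substitution $\lambda/(\lambda + \alpha) = e^{c}$, so the two expressions for the transform agree identically.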


Chapter XIII 


BROWNIAN MOTION AND LIMIT 

DISTRIBUTIONS 


Brownian motion, born from and used in Physics, is of ever increasing 
importance not only in Probability theory but also in classical Analysis. 
Its fascinating properties and its far-reaching extension of the simplest 
normal limit theorems to functional limit distributions acted, and con¬ 
tinue to act, as a catalyst in random Analysis. 

It is recommended that at least portions of this chapter be read as
soon as possible. At the cost of some repetitions its dependence upon
other chapters of this part is minimized, and properties which obtain
from deeper ones, established herein or elsewhere in this part, are also
established directly on a relatively elementary level.

§ 41. BROWNIAN MOTION 

41.1. Origins. In Physics, the ceaseless and extremely erratic dance
of microscopic particles suspended in a liquid or gas is called “Brownian
motion.” It was systematically investigated by Brown (1828, 1829),
a botanist, from movement of grains of pollen in water to a drop of
water in oil. He was not the first to mention this phenomenon and had
many predecessors, starting with Leeuwenhoek in the 17th century.
However, Brown's investigation brought it to the attention of the scientific
community, hence “Brownian.” Brownian motion was frequently
explained as due to the fact that particles were alive. Poincaré thought
that it contradicted the second law of Thermodynamics. Today we
know that this motion is due to the bombardment of the particles by
the molecules of the medium. In a liquid, under normal conditions, the
order of magnitude of the number of these impacts is of $10^{20}$ per second!
It is only in 1905 that kinetic molecular theory led Einstein to the
first mathematical model of Brownian motion. He began by deriving
its possible existence and then only learned that it had been observed.
Let us consider the x-component only of the motion: the one-dimensional
Brownian motion. Using formal analytic arguments and more
or less explicit probabilistic ones, such as stationary independent increments,
he derived a partial differential equation

$$\frac{\partial p}{\partial t} = \frac{D}{2} \frac{\partial^2 p}{\partial x^2}$$

for the pr. density $p = p_t(x)$ that the particle be at $x$ at time $t$; note
that by change of scale, this equation reduces to the “heat equation”

$$\frac{\partial p}{\partial t} = \frac{1}{2} \frac{\partial^2 p}{\partial x^2}.$$

If the particle starts at $x = 0$ at time 0, then

$$p_t(x) = \frac{1}{\sqrt{2\pi Dt}} e^{-x^2/2Dt}.$$
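As a rough numerical check, one may verify by finite differences that the normal density with variance $Dt$ satisfies a diffusion equation of the form $\partial p/\partial t = (D/2)\,\partial^2 p/\partial x^2$. The normalization (variance $Dt$, hence the factor $D/2$), the function names, the step $h$ and the sample points are assumptions of this sketch.

```python
import math

def density(t, x, D=1.0):
    """Normal density with mean 0 and variance D * t (assumed normalization)."""
    return math.exp(-x * x / (2 * D * t)) / math.sqrt(2 * math.pi * D * t)

def residual(t, x, D=1.0, h=1e-3):
    """Finite-difference value of dp/dt - (D/2) d^2p/dx^2; near zero if the
    density solves the equation."""
    dp_dt = (density(t + h, x, D) - density(t - h, x, D)) / (2 * h)
    d2p_dx2 = (density(t, x + h, D) - 2 * density(t, x, D)
               + density(t, x - h, D)) / (h * h)
    return dp_dt - 0.5 * D * d2p_dx2

for t, x, D in ((1.0, 0.5, 2.0), (0.7, -1.2, 1.0), (3.0, 0.0, 0.5)):
    print(t, x, D, residual(t, x, D))
```

The residuals are of the order of the discretization error, while each of the two terms taken separately is not small.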


Using physical arguments, Einstein showed that the “diffusion coefficient”
$D = 2RT/Nf$, where $R$ is the ideal gas constant, $T$ the absolute
temperature, $N$ the Avogadro number, and $f$ the friction coefficient
which depends upon viscosity and upon the particle properties. Soon
thereafter, Perrin in a series of experiments based upon Einstein's
model obtained a 19% approximation of the Avogadro number. This
led to the final acceptance of the kinetic molecular theory even by skeptics
such as Mach and Ostwald. In fact, Einstein's model was later
replaced by a dynamic one of Ornstein and Uhlenbeck, initiated by
Langevin's first “stochastic differential equation”; the interested reader
is referred to Nelson and to Wax. Perrin mentioned that Einstein's
model produces nowhere differentiable continuous functions which, before
this model, were considered by most mathematicians as special and
somewhat artificial constructs without much mathematical value
except as counterexamples. Thus, these “monsters” from whom Hermite
“turned away with horror,” became “natural beings” of importance in
Mathematics as well as in Physics.

A completely different origin of mathematical Brownian motion is a
game theoretic model for fluctuations of stock prices due to Bachelier.
In his thesis, he hinted that it could apply to physical Brownian motion.
Therein, and in his subsequent works, he used the heat equation and,
proceeding by analogy with “heat propagation,” he found, albeit formally,
distributions of various functionals of mathematical Brownian
motion. Heat equations and related parabolic type equations were used
rigorously by Kolmogorov, Petrovsky, and Khintchine (see §42).

Rigorous definition and study of (mathematical) Brownian motion requires
measure theory. Some 20 years after Lebesgue's thesis, Wiener
(1923) gave its first satisfactory construction and proved its almost sure
sample continuity. In 1933, together with Paley and Zygmund, he
proved nowhere differentiability of almost all Brownian sample functions.
Meanwhile, Khintchine (1924) found the Brownian Law of the
Iterated Logarithm. But it is in 1939 that P. Levy proceeded to an
analysis in depth, so exhaustive that since then only improvements of details
were obtained. In later works, he investigated multi-dimensional
Brownian motion and then Brownian motions where the time interval
was replaced by abstract space, especially by Hilbert space.

It is strongly recommended that the interested reader attempt to peruse
the chapters on Brownian motion in P. Levy's book (1964) which
are extraordinarily rich in ideas and results. The interested reader is
also referred to the remarkable monographs by Freedman and by Ito
and McKean.

In this chapter, we shall study the basic case: one-dimensional Brown¬ 
ian motion. 

Because of its importance and its intrinsic value, we assign a special
notation to Brownian motion: $W_T$ or $W$, with random values $W_t$ or
$W(t)$ according to convenience. The symbol “W” comes from “Wiener.”
Other symbols also in use are “w” and “B.”

41.2. Definitions and relevant properties. There are several definitions
of (one-dimensional) “Brownian” or “Wiener” or “Wiener-Levy”
process or random function or motion. We shall proceed by successive
refinements, based on required and relevant properties, to reach the
final definition.

A process $W_T = (W_t, t \in T \subset R)$ is “Brownian distributed” if it is
decomposable, that is, its increments $W_{st} = W_t - W_s$, $s < t$, $s, t \in T$,
on disjoint intervals are independent, and if

$$\mathcal{L}(W_{st}) = \mathfrak{N}(\alpha(t - s), \sigma^2(t - s)), \qquad \sigma^2 > 0;$$

$\alpha$ is called the drift and $\sigma^2$ is called the diffusion coefficient of $W_T$.

A more restrictive definition obtains by adding the requirements

$$W_0 = 0 \text{ a.s. and } T = [0, \infty),$$

equivalently:
equivalently: 
A random function (r.f.) $W_T = (W_t, t \ge 0)$ is “Brownian distributed”
if it is decomposable and, for $t \ge 0$,

$$\mathcal{L}(W_t) = \mathfrak{N}(\alpha t, \sigma^2 t), \qquad \sigma^2 > 0,$$

that is, the ch.f.'s are given by

$$f_{W_t}(u) = Ee^{iuW_t} = \exp\{i(\alpha t)u - \sigma^2 t u^2/2\}.$$

To simplify the writing and with no real loss, from now on we restrict
“Brownian distributed” to the “normalized” or “standard” form obtained
by taking $\alpha = 0$ and $\sigma^2 = 1$:

A r.f. $W_T = (W_t, t \ge 0)$ is Brownian distributed if its disjoint increments
are independent and

$$\mathcal{L}(W_t) = \mathfrak{N}(0, t), \qquad t \ge 0;$$

thus, $\mathcal{L}(W_t/\sqrt{t}) = \mathfrak{N}(0, 1)$, that is,

$$P(W_t/\sqrt{t} > a) = \frac{1}{\sqrt{2\pi}} \int_a^\infty e^{-v^2/2} \, dv, \qquad a \in R.$$
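To make the definition concrete, here is a minimal sketch which produces the values of a standard Brownian distributed r.f. on a finite grid by summing independent normal increments with variance equal to the step. The function name, the grid and the seed are assumptions made for the illustration; the sketch says nothing about sample continuity between grid points.

```python
import math
import random

def brownian_path(n_steps, t_max, seed=0):
    """Values W_0, W_dt, ..., W_{t_max} on a regular grid, built from
    independent N(0, dt) increments, with W_0 = 0."""
    rng = random.Random(seed)          # fixed seed: reproducible illustration
    dt = t_max / n_steps
    path = [0.0]
    for _ in range(n_steps):
        path.append(path[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return path

path = brownian_path(n_steps=8, t_max=2.0, seed=1)
for k, w in enumerate(path):
    print(round(k * 2.0 / 8, 3), round(w, 4))
```

Increments over disjoint grid intervals are independent by construction, and $W_t$ is a sum of $t/dt$ independent $\mathfrak{N}(0, dt)$ terms, hence $\mathfrak{N}(0, t)$.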

A. Brownian covariance criterion. A r.f. $W_T$ is Brownian distributed
if and only if it is centered normal with covariance given by $EW_sW_t =
s \wedge t$.

Proof. Let $W_T$ be Brownian distributed so that all $EW_t = 0$ and, for
any finite section $(W_{t_1}, \ldots, W_{t_m})$ with $t_1 < \cdots < t_m$, by decomposability,

$$E \exp\{iu_1W_{t_1} + \cdots + iu_mW_{t_m}\} = E \exp\{i(u_1 + \cdots + u_m)W_{0t_1}\} \times \cdots \times E \exp\{iu_mW_{t_{m-1}t_m}\}$$

hence, the increments $W_{t_{k-1}t_k}$ being normal, so is $(W_{t_1}, \ldots, W_{t_m})$.
Thus, $W_T$ is centered normal and $EW_sW_t = s \wedge t$ since for, say, $s \le t$,

$$EW_sW_t = EW_s(W_s + W_{st}) = EW_s^2 = s = s \wedge t.$$

Conversely, let $W_T$ be centered normal with $EW_sW_t = s \wedge t$. Then,
all $W_t$ are normal with $EW_t = 0$ and $EW_t^2 = t$ so that $\mathcal{L}(W_t) = \mathfrak{N}(0, t)$.
Since laws of finite sections are normal so are those of finite sets of increments
on disjoint intervals. But such increments, say $W_{ss'}$, $W_{tt'}$ with $s < s' \le t < t'$,
are orthogonal, for

$$E(W_{s'} - W_s)(W_{t'} - W_t) = s' - s' - s + s = 0.$$

Therefore they are independent, and the proof is terminated.
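The orthogonality computation of the proof can be written out as a small sketch: starting from the covariance $s \wedge t$ alone, the covariance of two increments is obtained by bilinearity. The function names and the sample points are our own.

```python
def cov(s, t):
    """Brownian covariance E W_s W_t = min(s, t)."""
    return min(s, t)

def increment_cov(s1, s2, t1, t2):
    """E (W_{s2} - W_{s1})(W_{t2} - W_{t1}), expanded by bilinearity."""
    return cov(s2, t2) - cov(s2, t1) - cov(s1, t2) + cov(s1, t1)

print(increment_cov(0.5, 1.0, 1.0, 2.5))   # disjoint intervals
print(increment_cov(0.0, 2.0, 1.0, 3.0))   # intervals overlapping on [1, 2)
print(increment_cov(0.2, 0.9, 0.2, 0.9))   # same interval: its length
```

In general the result is the length of the intersection of the two time-intervals, which vanishes exactly when they are disjoint.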
We establish now, in a rather elementary way, a few quite useful properties
of Brownian distributed r.f. $W_T$. Unless otherwise stated, from
now on, we set $\mathcal{B}_t = \mathcal{B}(W_s, s \le t)$.

a. Brownian martingales property. If a r.f. $W_T$ is Brownian distributed
then

(i) $(W_t, \mathcal{B}_t, t \ge 0)$ is a martingale

and

(ii) $(W_t^2 - t, \mathcal{B}_t, t \ge 0)$ is a martingale

or, equivalently, under (i), a.s.

(ii') $E(W_{st}^2 \mid \mathcal{B}_s) = t - s$, $0 \le s < t < \infty$.

Proof. The equivalence assertion is immediate: Let (i) hold. Then,
for $s < t$, a.s.

$$E(W_sW_t \mid \mathcal{B}_s) = W_sE(W_t \mid \mathcal{B}_s) = W_s^2$$

so that, a.s.

$$E((W_t - W_s)^2 - (t - s) \mid \mathcal{B}_s) = E(W_t^2 - t \mid \mathcal{B}_s) - (W_s^2 - s).$$

When (ii) holds then the right side vanishes a.s. and when (ii') holds then
the left side vanishes a.s. The equivalence assertion is proved.

Let $W_T$ be Brownian distributed so that its increments $W_{st}$ on disjoint
intervals are independent and normal $\mathfrak{N}(0, t - s)$. Then (i) holds since
a.s.

$$E(W_t \mid \mathcal{B}_s) - W_s = E(W_{st} \mid \mathcal{B}_s) = EW_{st} = 0$$

and (ii') holds since a.s.

$$E(W_{st}^2 \mid \mathcal{B}_s) = EW_{st}^2 = t - s.$$

The proof is terminated.

Throughout this chapter, we set

$$M_t = \sup_{s \le t} W_s, \qquad m_t = \inf_{s \le t} W_s.$$

B. Brownian extrema on $(0, \infty)$. If a r.f. $W_T$ is Brownian distributed
then $M_\infty = \lim_{t \to \infty} M_t$ and $m_\infty = \lim_{t \to \infty} m_t$ exist, and a.s.

(i) $m_t < 0 < M_t$ on $(0, \infty]$,

(ii) $-\infty = m_\infty = \liminf_{t \to \infty} W_t < \limsup_{t \to \infty} W_t = M_\infty = +\infty$.






Proof. Since $M_t$ and $m_t$ are monotone in $t$, $M_\infty$ and $m_\infty$ exist and
assertion (ii) has meaning.

1°. We prove (i). Since $M_t$ does not decrease when $t$ increases, it
suffices to examine its behavior near $t = 0$. Let $t_n \downarrow 0$ and note that,
by symmetry,

$$PA_n = P(W_{t_n} > 0) = 1/2$$

hence

$$P(\limsup A_n) \ge \limsup PA_n = 1/2.$$

But $\limsup A_n$ is a tail event on the sequence $(W_{t_n} - W_{t_{n+1}})$ of independent
r.v.'s, hence, by Kolmogorov zero-one law, its pr. is either 0 or 1.
It follows that $P(\limsup A_n) = 1$. Since

$$\limsup A_n = [W_{t_n} > 0 \text{ i.o.}] \subset [M_t > 0 \text{ for all } t > 0],$$

the last event has pr. 1. By Brownian symmetry, the same is true for the
event $[m_t < 0 \text{ for all } t > 0]$, and (i) is proved.

2°. We prove (ii). By symmetry, $\liminf_{t \to \infty} W_t = -\infty$ is implied by
$\limsup_{t \to \infty} W_t = +\infty$ and, since

$$M_\infty = \sup_t W_t \ge \limsup_{t \to \infty} W_t \ge \limsup_{n \to \infty} W_n,$$

it suffices to show that $\limsup W_n = +\infty$ a.s. But, given $m$,

$$P(W_n > m) = \frac{1}{\sqrt{2\pi}} \int_{m/\sqrt{n}}^\infty e^{-v^2/2} \, dv \to 1/2$$

hence

$$P(W_n > m \text{ i.o.}) = P(\limsup_n [W_n > m]) \ge \limsup P(W_n > m) = 1/2.$$

Therefore, letting $m \to \infty$,

$$p = P(\limsup W_n = +\infty) \ge 1/2.$$

Since the sequence $(W_n)$ is that of successive sums of iid r.v.'s $W_n -
W_{n-1}$, the Hewitt-Savage zero-one law applies to the exchangeable event
$[\limsup W_n = +\infty]$ so that $p = 0$ or 1. But $p \ge 1/2$ hence $p = 1$, and
(ii) is proved.
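The convergence $P(W_n > m) \to 1/2$ used in 2° is slow for large $m$ but easy to tabulate. The sketch below expresses the normal tail through the complementary error function; the function name and the value of $m$ are chosen only for illustration.

```python
import math

def prob_exceeds(n, m):
    """P(W_n > m) when W_n is N(0, n): the normal tail beyond m / sqrt(n)."""
    return 0.5 * math.erfc(m / math.sqrt(2 * n))

m = 5.0  # illustrative level
for n in (1, 100, 10**4, 10**6, 10**8):
    print(n, prob_exceeds(n, m))
```

For fixed $m$ the probabilities increase to $1/2$ as $n \to \infty$, which is all that the argument requires.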

So far our definition and properties were in terms of Brownian law
only. In fact, the most frequent definition of “Brownian motion” contains
one basic supplementary condition: $W_T$ is sample continuous, that
is, its sample functions $(W_t(\omega), t \in T)$ are continuous. The definition
is then followed by a laborious existence proof. On the other hand, we
know that in order to proceed to sample Analysis we need separability:
As usual, our r.f.'s are, or are selected to be, separable. A Brownian distributed
r.f., determined by the preceding Brownian covariance criterion
A, exists because of the Daniell-Kolmogorov consistency theorem 4.3A.
Furthermore, we know from 38.2 that every r.f. has equivalent separable
versions, with the same law.

Clearly, sample continuity implies separability and we shall show that,
conversely, we can choose separable versions of Brownian distributed
r.f.'s which are sample continuous. In fact, this results at once from
40.3B($\mathfrak{N}$) which was obtained by a technically involved analysis of separable
decomposable processes. Here we shall give a direct and relatively
elementary proof by means of some properties of independent
usefulness.

First note that a Brownian distributed r.f. is continuous in law
since, as $s \to t$,

$$Ee^{iuW_s} = e^{-u^2 s/2} \to e^{-u^2 t/2} = Ee^{iuW_t}.$$

Moreover, it is continuous in q.m. hence in pr. since, as $s \to t$,

$$E|W_s - W_t|^2 = |s - t| \to 0.$$

But when a r.f. $W_T$ is continuous in pr. we can take for the separating set
any countable set dense in $T$, such as the subset of all rationals or of all
dyadic numbers; we shall do it frequently.

Remark. Note that, by 40.3a, a separable Brownian distributed r.f.
is also continuous a.s. and all these types of continuity are uniform on
every finite subinterval of $T$.

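b. Normal approximation lemma. If $\psi(a) = \int_a^\infty e^{-v^2/2}\,dv$ then, as
$a \to \infty$,

$$\psi(a) \sim e^{-a^2/2}/a$$

and, more precisely, for $a > 0$,

$$(1 - 1/a^2)\,e^{-a^2/2}/a \le \psi(a) \le e^{-a^2/2}/a.$$

Proof. Upon integrating by parts we have, for $a > 0$,

(1) $\psi(a) = -\int_a^\infty \frac{1}{v}\,d(e^{-v^2/2}) = e^{-a^2/2}/a - \int_a^\infty \frac{1}{v^2}\,e^{-v^2/2}\,dv$

where, for $a \to \infty$,

(2) $\int_a^\infty \frac{1}{v^2}\,e^{-v^2/2}\,dv \le \psi(a)/a^2 = o(\psi(a)).$

It follows that, as $a \to \infty$,

$$\psi(a) \sim e^{-a^2/2}/a.$$

Also (1) and (2) yield, for $a > 0$,

$$(1 - 1/a^2)\,e^{-a^2/2}/a \le \psi(a) \le e^{-a^2/2}/a.$$

The proof is terminated.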
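
As a numerical companion to lemma b, the sketch below evaluates the tail integral $\int_a^\infty e^{-v^2/2}\,dv$ through the complementary error function and compares it with the upper bound $e^{-a^2/2}/a$. The lower bound $(1 - 1/a^2)e^{-a^2/2}/a$ used here is one form that follows from the integration by parts; the function names are ours and purely illustrative.

```python
import math


def tail_integral(a: float) -> float:
    """Integral of exp(-v^2/2) over [a, infinity), written with erfc."""
    return math.sqrt(math.pi / 2.0) * math.erfc(a / math.sqrt(2.0))


def upper_bound(a: float) -> float:
    """The bound exp(-a^2/2)/a, valid for a > 0."""
    return math.exp(-a * a / 2.0) / a


def lower_bound(a: float) -> float:
    """A lower bound obtained from the same integration by parts."""
    return (1.0 - 1.0 / (a * a)) * upper_bound(a)


if __name__ == "__main__":
    for a in (0.5, 1.0, 2.0, 5.0, 10.0):
        ratio = tail_integral(a) / upper_bound(a)
        print(f"a={a:5.1f}  tail/upper={ratio:.6f}")
```

The printed ratio increases toward 1 as $a$ grows, which is the asymptotic equivalence asserted in the lemma.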

The following inequalities are trivial extensions of P. Lévy inequalities
18.1C: the same argument applies. We repeat it.

c. Let $\xi_1, \ldots, \xi_n$ be independent r.v.'s and set $U_0 = V_n = 0$,

$$U_k = \xi_1 + \cdots + \xi_k, \quad V_k = \xi_{k+1} + \cdots + \xi_n, \quad k = 0, \ldots, n.$$

If $P(V_k \ge b) \ge p$ for all $k$, then

$$p\,P(\max_k U_k > a) \le P(U_n > a + b).$$

If $P(|V_k| \le b) \ge q$ for all $k$, then

$$q\,P(\max_k |U_k| > a + b) \le P(|U_n| > a).$$

Proof. Upon setting

$$A_k = [U_0 \le a, \ldots, U_{k-1} \le a, U_k > a] \text{ with } A_0 = [U_0 > a]$$

and

$$B_k = [V_k \ge b],$$

the first hypothesis yields the asserted inequality by

$$P(U_n > a + b) \ge \sum_k P(A_k B_k) = \sum_k PA_k\,PB_k \ge p \sum_k PA_k = p\,P(\max_k U_k > a).$$

Since $|U_n| \ge |U_k| - |V_k|$, upon setting

$$A_k = [|U_0| \le a + b, \ldots, |U_{k-1}| \le a + b, |U_k| > a + b] \text{ with } A_0 = [|U_0| > a + b]$$

and

$$B_k = [|V_k| \le b],$$

the second hypothesis yields the asserted inequality by

$$P(|U_n| > a) \ge \sum_k P(A_k B_k) = \sum_k PA_k\,PB_k \ge q\,P(\max_k |U_k| > a + b).$$

d. Extrema inequalities. If $W_T$ is a separable Brownian distributed
r.f. then, for $a \ge 0$,

$$P(M_t > a) \le 2P(W_t > a), \quad P(m_t < -a) \le 2P(W_t < -a),$$

$$P(\sup_{0 \le s \le t} |W_s| > a) \le 2P(|W_t| > a).$$

Proof. It suffices to prove the first inequality. For, the second follows
by Brownian symmetry and the third one results by adding them.
Because of separability and continuity in pr., it suffices to show that
the first inequality holds for $s$ varying over a countable set $\{s_1, s_2, \ldots\}$
dense in $[0, t]$; we can and do include therein $s_1 = 0$ and $s_2 = t$.

Let $0 = t_0 < \cdots < t_n = t$ be the first $n + 1$ terms of this set reordered by
increasing values. Since increments on disjoint intervals are independent,

$$W_{t_k} = W_{t_0,t_1} + \cdots + W_{t_{k-1},t_k}, \quad W_{t_n} - W_{t_k} = W_{t_k,t_{k+1}} + \cdots + W_{t_{n-1},t_n}$$

and the laws of $W_{t_n} - W_{t_k}$ are symmetric hence have zero medians, c
applies with $b = 0$ and $p = 1/2$. Thus,

$$P(\sup_{k \le n} W_{t_k} > a) \le 2P(W_t > a)$$

and, letting $n \to \infty$, the asserted inequality follows.

C. Brownian sample continuity. Almost all sample functions of a
separable Brownian distributed r.f. $W_T$ are continuous. More precisely, for
every $p < 1/2$, there is a r.v. $\nu_p$ such that $n > \nu_p$ implies that a.s.

$$|W_s - W_t| < 3/n^p \text{ for } |s - t| < 1/n, \quad s, t \in [0, n].$$

Proof. The continuity assertion results from the last one since then
almost all sample functions are uniformly continuous on every finite
interval.

Thus, let

$$Y_n = \sup_{|s - t| < 1/n} |W_s - W_t|, \quad s, t \in [0, n],$$

and set

$$Z_k = \sup_{t \in J_k} |W_t - W_{(k-1)/n}|, \quad J_k = [(k-1)/n, k/n], \quad k = 1, \ldots, n^2.$$

By triangle inequality

$$Y_n \le 3 \max_k Z_k.$$

Since the $Z_k$ are iid r.v.'s, for every given $\epsilon > 0$,

$$p_n = P(\max_k Z_k > \epsilon) = P \bigcup_k [Z_k > \epsilon] \le \sum_k P(Z_k > \epsilon) \le n^2 P(Z_1 > \epsilon).$$

But, by d,

$$P(Z_1 > \epsilon) \le 2P(|W_{1/n}| > \epsilon) = 2\sqrt{2/\pi} \int_{\epsilon\sqrt{n}}^{\infty} e^{-v^2/2}\,dv$$

so that, by b,

$$p_n \le 2n^2 P(|W_{1/n}| > \epsilon) \sim 2\sqrt{2/\pi}\; n^{3/2} e^{-n\epsilon^2/2}/\epsilon.$$

Upon taking $\epsilon = \epsilon_n = 1/n^p$ with $p < 1/2$, it follows that $\sum p_n < \infty$.
Therefore, by Borel-Cantelli lemma, there is a r.v. $\nu_p$ such that a.s.
$n > \nu_p$ implies $Y_n \le 3/n^p$ on $[0, n]$, and the proof is terminated.

The a.s. sample continuity permits us to complete a, but we will not need
the “if” assertion of the criterion below.

D. Brownian martingales criterion. $W_T$ is a separable Brownian
distributed r.f. if and only if $W_T$ is a.s. sample continuous and

(i) $(W_t, \mathfrak{B}_t, t \ge 0)$ is a martingale

with

(ii) $(W_t^2 - t, \mathfrak{B}_t, t \ge 0)$ is a martingale

or with

(iii) $E(W_{st}^2 \mid \mathfrak{B}_s) = t - s$, $0 \le s < t < \infty$.

For, by a, the “only if” assertion results from Brownian law and the
“if” assertion is 39.2B.

We are now ready for the final 

Brownian motion definition. A r.f. $W_T = (W_t, t \ge 0)$ is said to be
a Brownian motion, or Brownian for short, if

(i) $W_T$ is Brownian distributed, that is, it is centered normal with
covariance defined by $EW_sW_t = s \wedge t$ or, equivalently, it is decomposable
with $\mathcal{L}(W_{st}) = \mathfrak{N}(0, t - s)$, and

(ii) all sample functions of $W_T$ are continuous, unbounded of both signs,
that is, $\liminf_{t \to \infty} W_t = -\infty$, $\limsup_{t \to \infty} W_t = +\infty$, and start at 0, that is,
$W_0 = 0$.

Restrictions of Brownian $W_T$ to subintervals of $T$ will also be said to be
Brownian.

Such versions exist: Brownian distributed $W_T$ exists. Take a separable
version also denoted by $W_T$ so that, outside a null event $N_1$, the
sample functions are continuous. By B, outside a null event $N_2$,

$$\liminf_{t \to \infty} W_t = -\infty, \quad \limsup_{t \to \infty} W_t = +\infty$$

and, say, by A, $W_0 = 0$ outside a null event $N_3$.
Thus, (i) and (ii) hold outside the null event $N = N_1 \cup N_2 \cup N_3$.
Then, either throw out $N$ or, for all $t \ge 0$, set, say,

$$W_t = t^{1/2} \sin t \text{ on } N.$$

Because of sample continuity

$$M_t = \max_{0 \le s \le t} W_s, \quad m_t = \min_{0 \le s \le t} W_s$$

and combining B with sample continuity we obtain at once

e. Brownian sample functions have an infinity of zeros in every neighborhood
of $t = 0$ and of $t = \infty$.

Recall that a zero of a function $W_T(\omega)$ is a value of $t$ such that
$W_t(\omega) = 0$.

While continuous, Brownian sample functions are not differentiable
a.s. at any fixed $t$, that is,

f. The $W_T(\omega)$ are not differentiable at fixed $t$ for $\omega \notin N_t$ where $N_t$ is a
null event.

For, with arbitrary $a > 0$, as $h \to 0$,

$$P(|W_{t+h} - W_t| \le a|h|) = \sqrt{2/\pi} \int_0^{a\sqrt{|h|}} e^{-v^2/2}\,dv \to 0.$$

While $N_t$ may depend upon $t$, we shall see in the next subsection that
it may be replaced by a null event $N$ independent of $t$, so that almost all
Brownian sample functions are “Hermitian monsters.”

For the forthcoming “time inversion” we need

g. If $W_T$ is Brownian then

$$tW_{1/t} \to 0 \text{ a.s. as } t \to 0, \text{ equivalently, } W_t/t \to 0 \text{ a.s. as } t \to \infty.$$

Proof. The equivalence is immediate: Change $t$ into $1/t$ in either
assertion, and we prove the first one.

Let $W'_t = tW_{1/t}$ for $t > 0$ and $W'_0 = 0$. It follows at once that
$EW'_sW'_t = st(1/s \wedge 1/t) = s \wedge t$ for $s, t > 0$ and also for $s = 0$ or $t = 0$.
Thus, $W'_T$ is Brownian distributed. Since $W_T$ is sample continuous on
$[0, \infty)$, $W'_T$ is sample continuous, hence separable, but on $(0, \infty)$. Take
a separating set in $(0, \infty)$ and adjoin to it $t = 0$. This enlarged separating
set yields a separable version on $[0, \infty)$ with the same Brownian
law that we continue to denote by $W'_T$. But separable Brownian distributed
r.f. $W'_T$ is a.s. sample continuous on $T = [0, \infty)$ hence $W'_t \to W'_0 = 0$
a.s. as $t \to 0$, and the assertion is proved.

By refining as above the separable Brownian distributed r.f., we obtain
a Brownian motion that we shall still denote by

$$W'_T = (W'_0 = 0, \; W'_t = tW_{1/t}).$$

The following transformations will be very useful:

E. Brownian invariance theorem. If $W_T = (W_t, t \ge 0)$ is Brownian
then so are

$(-W_t, \; t \ge 0)$    symmetry
$(W_{s+t} - W_s, \; t \ge 0)$, $s \ge 0$    origin change
$(cW_{t/c^2}, \; t \ge 0)$, $c > 0$    scale change
$(W'_0 = 0, \; W'_t = tW_{1/t}, \; t \ge 0)$    time inversion
$(W_u - W_{u-t}, \; 0 \le t \le u)$    time reversal

For, clearly Brownian law (compute the covariances) and sample
continuity hold as well as the start at 0, and unboundedness holds except
for time reversal where it has no content.

Convention. From now on, $W_T = (W_t, t \ge 0)$ denotes Brownian
motion, unless otherwise stated. However, we shall mention it in statements
of propositions.
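
The instruction "compute the covariances" in the invariance theorem can be carried out mechanically. The sketch below does so in exact rational arithmetic, starting from the Brownian covariance $s \wedge t$ and using bilinearity only. The scale change is taken in the form $cW_{t/c^2}$, which is an assumption about the intended parametrization, and all function names are illustrative.

```python
from fractions import Fraction


def brownian_cov(s, t):
    """Covariance E W_s W_t = min(s, t) of a Brownian distributed r.f."""
    return min(s, t)


def cov_origin_changed(s, t, u):
    """Cov(W_{u+s} - W_u, W_{u+t} - W_u), expanded by bilinearity."""
    return (brownian_cov(u + s, u + t) - brownian_cov(u + s, u)
            - brownian_cov(u, u + t) + brownian_cov(u, u))


def cov_scaled(s, t, c):
    """Cov(c W_{s/c^2}, c W_{t/c^2}) for c > 0 (assumed form of scale change)."""
    return c * c * brownian_cov(s / (c * c), t / (c * c))


def cov_time_inverted(s, t):
    """Cov(s W_{1/s}, t W_{1/t}) for s, t > 0."""
    return s * t * brownian_cov(1 / s, 1 / t)


def cov_time_reversed(s, t, u):
    """Cov(W_u - W_{u-s}, W_u - W_{u-t}) for 0 <= s, t <= u."""
    return (brownian_cov(u, u) - brownian_cov(u, u - t)
            - brownian_cov(u - s, u) + brownian_cov(u - s, u - t))


if __name__ == "__main__":
    s, t = Fraction(1, 3), Fraction(5, 4)
    print(cov_origin_changed(s, t, Fraction(2)), cov_scaled(s, t, Fraction(3)),
          cov_time_inverted(s, t), cov_time_reversed(s, t, Fraction(2)))
```

Each transformed covariance reduces to $s \wedge t$; since all the transformed r.f.'s are centered normal, this identifies their law as Brownian.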

41.3. Brownian sample oscillations. The preceding subsection was 
centered about the search for the most refined version of Brownian 
r.f.'s by means of direct and rather elementary arguments. The basic
achievement was sample continuity. We examine now this property in 
much more detail and describe the extremely irregular oscillations on 
intervals and locally in terms of Lipschitz conditions, establish sample 
nowhere differentiability and study sample variations on intervals. We 
shall use 41.2 without further comment. 

Let $s, t$ belong to $[0, t_0]$.
Kolmogorov a.s. sample continuity condition on $[0, t_0]$, proved in the
Corollary of 38.3B, is

$$E|W_s - W_t|^a \le c_0|s - t|^{1+b}, \quad a, b, c_0 \text{ positive}.$$

Since for separable Brownian distributed $W_{[0,t_0]}$

$$E|W_s - W_t|^{2n} = c_n|s - t|^n, \quad n = 1, 2, \ldots$$

where $c_n = (2n)!/2^n n!$ (see 13.4.1°), Brownian a.s. sample continuity obtains
by taking $n > 1$ and $t_0$ as large as desired. Furthermore, we proved
in the above proposition that, in fact, the Kolmogorov condition yields
more than a.s. continuity: For every $c > 1$, there is a r.v. $r_c$ such that,
for $|s - t| < r_c$ and $0 < p < b/a$, a.s.

$$|W_s - W_t| < c|s - t|^p.$$

Since $a = 2n$ and $b = n - 1$ where $n$ can be taken as large as desired,
$p < (n - 1)/2n \uparrow 1/2$ and it follows that for every $c > 1$ there is a r.v.
$r_c$ such that, for $|s - t| < r_c$ and $p$ as close to 1/2 (from below) as desired

$$|W_t - W_s| < c|t - s|^p.$$

Thus, the first assertion below holds:

a. Brownian Lipschitz p-property. Almost all Brownian sample
functions are Lip $p$ for $p < 1/2$ but are not Lip 1/2, on any given interval
$[0, t_0]$.
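
The arithmetic behind the Kolmogorov condition can be tabulated directly. The sketch below computes the normal moment constants $c_n = (2n)!/2^n n!$ (equal to the product $1 \cdot 3 \cdots (2n-1)$) and the admissible exponents $b/a = (n-1)/2n$, which increase to 1/2. The function names are ours.

```python
import math
from fractions import Fraction


def normal_even_moment(n: int) -> int:
    """c_n = (2n)! / (2^n n!), the 2n-th moment of a standard normal r.v."""
    return math.factorial(2 * n) // (2 ** n * math.factorial(n))


def odd_product(n: int) -> int:
    """1 * 3 * ... * (2n - 1), an equivalent expression for c_n."""
    result = 1
    for j in range(1, 2 * n, 2):
        result *= j
    return result


def lipschitz_exponent_bound(n: int) -> Fraction:
    """b/a = (n - 1)/(2n) when a = 2n and b = n - 1 in the Kolmogorov condition."""
    return Fraction(n - 1, 2 * n)


if __name__ == "__main__":
    for n in (1, 2, 3, 10, 100, 1000):
        print(n, normal_even_moment(n) if n <= 10 else "...",
              float(lipschitz_exponent_bound(n)))
```

The exponent bound never reaches 1/2, which is consistent with the assertion that the sample functions are Lip $p$ only for $p < 1/2$.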

In fact, both a.s. sample continuity and a follow at once from the much
more precise

A. Brownian sample continuity modulus. If $W_T$ is Brownian
then, in any given interval $[0, t_0]$, a.s.

(1) $\limsup_{|s-t| \to 0} |W_s - W_t| / \sqrt{2|s - t| \log 1/|s - t|} = 1,$

equivalently,

when $c > 1$, there is a r.v. $r_c > 0$ such that for $|s - t| < r_c$, a.s.,

(1') $|W_s - W_t| < c\sqrt{2|s - t| \log 1/|s - t|},$

when $c < 1$, there are arbitrarily small values of $|s - t|$ such that, a.s.,

(1'') $|W_s - W_t| > c\sqrt{2|s - t| \log 1/|s - t|}.$

Proof. The equivalence assertion results from the definition of
“limsup,” and we prove (1') and (1'').

To simplify the writing we take $t_0 = 1$. Let

$$h = |s - t| \text{ with } 0 < h < 1, \quad g(h) = \sqrt{2h \log 1/h}.$$

Since $W_s - W_t$ is normal $\mathfrak{N}(0, h)$, we obtain

$$q(h) = P(|W_s - W_t| > cg(h)) = \sqrt{2/\pi} \int_{c\sqrt{2 \log 1/h}}^{\infty} e^{-v^2/2}\,dv \sim h^{c^2}/c\sqrt{\pi \log 1/h}.$$

When $c > 1$, elementary computations show that all conditions of 38.2B
hold, and (1') obtains.

When $c < 1$, set

$$A_{nk} = [|W_{kh_n} - W_{(k-1)h_n}| > cg(h_n)], \quad h_n = 1/2^n, \quad k = 1, \ldots, 2^n.$$

The $A_{nk}$ are independent in $k$ so that

$$P \bigcup_k A_{nk} = 1 - (1 - q(h_n))^{2^n} \to 1.$$

Thus, with pr. as close to 1 as desired, at least one of $A_{nk}$ occurs for $n$
sufficiently large, that is, for $h_n$ sufficiently small, depending upon the
sample function, and (1'') obtains. The proof is terminated.
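
The role of the threshold $c = 1$ in the continuity modulus can be seen numerically. The sketch below evaluates $q(h) = P(|W_s - W_t| > c\sqrt{2h \log 1/h})$ exactly through the complementary error function, compares it with the asymptotic form $h^{c^2}/c\sqrt{\pi \log 1/h}$, and computes the expected number $2^n q(2^{-n})$ of dyadic increments exceeding the level. The function names are illustrative.

```python
import math


def q_exact(c: float, h: float) -> float:
    """P(|N(0, h)| > c * sqrt(2 h log(1/h))) for 0 < h < 1."""
    return math.erfc(c * math.sqrt(math.log(1.0 / h)))


def q_asymptotic(c: float, h: float) -> float:
    """Asymptotic equivalent h^(c^2) / (c sqrt(pi log(1/h))) as h -> 0."""
    return h ** (c * c) / (c * math.sqrt(math.pi * math.log(1.0 / h)))


def expected_exceedances(c: float, n: int) -> float:
    """Expected number of the 2^n dyadic increments of step 2^-n above the level."""
    h = 2.0 ** (-n)
    return (2.0 ** n) * q_exact(c, h)


if __name__ == "__main__":
    for c in (0.8, 1.2):
        for n in (10, 20, 40):
            print(f"c={c} n={n} expected exceedances={expected_exceedances(c, n):.3e}")
```

For $c < 1$ the expected count grows without bound, while for $c > 1$ it tends to 0, in agreement with (1'') and (1').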

While A describes Brownian sample oscillations in fixed finite intervals
in terms of a uniform Lipschitz condition, a more precise one, as is to be
expected, describes local sample oscillations, that is, in neighborhoods of
fixed points in $[0, \infty)$. As background to LIT, we prove

b. If $W_T$ is Brownian and $g$ is a Borel function on $(0, \infty)$ to $(0, \infty)$, then

$$\liminf_{t \to 0} W_t/g(t) \text{ and } \limsup_{t \to 0} W_t/g(t) \text{ are degenerate}$$

and

$$P(\lim_{t \to 0} W_t/g(t) \text{ exists}) = 0 \text{ or } 1.$$

For, setting

$$\mathfrak{B}_{t+} = \bigcap_{u > t} \mathfrak{B}_u = \bigcap_{u > t} \mathfrak{B}(W_s, s \le u),$$

the assertions result at once from the fact that the limits therein are
$\mathfrak{B}_{0+}$-measurable, by the Blumenthal zero-one law in the Brownian case:

b$_0$. The $\sigma$-field $\mathfrak{B}_{0+}$ is degenerate.

Proof. $\mathfrak{B}_{0+} \subset \mathfrak{B}_s$ for all $s > 0$ while, by Brownian decomposability, for
all $t > s > 0$, $\mathfrak{B}_s$ and $\mathfrak{B}(W_{st})$, hence $\mathfrak{B}_{0+}$ and $\mathfrak{B}(W_{st})$, are independent.
But, by Brownian sample continuity (at 0), as $s \to 0$,

$$W_{st} \to W_t.$$

It follows that $\mathfrak{B}_{0+}$ and $\mathfrak{B}_t$ are independent for all $t > 0$ hence $\mathfrak{B}_{0+}$ and
$\bigcap_{t > 0} \mathfrak{B}_t = \mathfrak{B}_{0+}$ are independent, and the assertion obtains.
t>o 

B. Laws of the iterated logarithm (LIT). Let $W_T = (W_t, t \ge 0)$
be Brownian. Then, a.s.

Local LIT: For every fixed $s \ge 0$

$$\limsup_{t \to 0} (W_{s+t} - W_s)/\sqrt{2t \log\log 1/t} = 1$$

$$\liminf_{t \to 0} (W_{s+t} - W_s)/\sqrt{2t \log\log 1/t} = -1$$

$$\limsup_{t \to 0} |W_{s+t} - W_s|/\sqrt{2t \log\log 1/t} = 1.$$

Asymptotic LIT:

$$\limsup_{t \to \infty} W_t/\sqrt{2t \log\log t} = 1$$

$$\liminf_{t \to \infty} W_t/\sqrt{2t \log\log t} = -1$$

$$\limsup_{t \to \infty} |W_t|/\sqrt{2t \log\log t} = 1.$$

Proof. Local LIT reduces to those at $s = 0$ by Brownian origin
change invariance. In turn, these ones and asymptotic LIT reduce to
each other by Brownian time inversion invariance. Thus, it suffices to
prove, say, local LIT at $s = 0$. But the third one obtains from the first
two and the second one obtains from the first one by Brownian symmetry.
Thus, it remains only to show that a.s.

(1) $\limsup_{t \to 0} W_t/\sqrt{2t \log\log 1/t} = 1.$

Let

$$g(t) = \sqrt{2t \log\log 1/t}$$

and let $t_n = q^n$ with $0 < q < 1$ to be selected suitably.

We have to prove (1) or equivalently, that a.s.

(1') $\limsup_{t \to 0} W_t/g(t) \le c$ for every $c > 1$

and

(1'') $\limsup_{t \to 0} W_t/g(t) \ge c$ for every $c < 1$.

1°. To prove (1'), it suffices to show that $P(A_n \text{ i.o.}) = 0$ where

$$A_n = [W_t > cg(t) \text{ for some } t \in [t_{n+1}, t_n]]$$

with given $c > 1$ and some $q \in (0, 1)$.

Since $g(t)$ decreases as $t$ decreases,

$$A_n \subset [M_{t_n} > cg(t_{n+1})].$$

But, for every $\epsilon > 0$,

$$P(M_{t_n} > \epsilon\sqrt{t_n}) \le 2P(W_{t_n} > \epsilon\sqrt{t_n}) = \sqrt{2/\pi} \int_\epsilon^\infty e^{-v^2/2}\,dv \le \sqrt{2/\pi}\; e^{-\epsilon^2/2}/\epsilon$$

so that, taking

$$\epsilon = \epsilon_n = cg(t_{n+1})/\sqrt{t_n} = c\sqrt{2q \log\{(n + 1) \log 1/q\}},$$

it follows that

$$p_n = PA_n \le P(M_{t_n} > \epsilon_n\sqrt{t_n}) \le \sqrt{2/\pi}\; e^{-\epsilon_n^2/2}/\epsilon_n \sim \alpha/(n + 1)^{c^2 q}\sqrt{\log(n + 1)}$$

where $\alpha = \alpha(c, q)$ is independent of $n$.

Since $c > 1$, we can select $q$ such that $1/c^2 < q < 1$ so that $c^2 q > 1$.
Then $\sum p_n < \infty$ hence, by the Borel-Cantelli lemma, $P(A_n \text{ i.o.}) = 0$,
and (1') is proved.

2°. To prove (1''), it suffices to show that

$$P(W_{t_n} > cg(t_n) \text{ i.o.}) = 1$$

for given $c < 1$ and some $q \in (0, 1)$.

Since the increments $Y_n = W_{t_n} - W_{t_{n+1}}$ are normal $\mathfrak{N}(0, t_n - t_{n+1})$, for
every $\epsilon > 0$,

$$P(Y_n > \epsilon\sqrt{t_n - t_{n+1}}) = \frac{1}{\sqrt{2\pi}} \int_\epsilon^\infty e^{-v^2/2}\,dv \sim \frac{1}{\sqrt{2\pi}}\; e^{-\epsilon^2/2}/\epsilon.$$

Let

$$\epsilon = \epsilon_n = c_1 g(t_n)/\sqrt{t_n} = c_1\sqrt{2 \log(n \log 1/q)}$$

with $c < c_1 < 1$. Then

$$p_n = P(Y_n > c_1(1 - q)^{1/2} g(t_n)) \sim \alpha_1/n^{c_1^2}\sqrt{\log n}$$

where $\alpha_1 = \alpha_1(c_1, q)$ is independent of $n$. Since $c_1^2 < 1$ we have $\sum p_n = \infty$
and, the $Y_n$ being independent, the Borel zero-one law applies, hence

$$P(Y_n > c_1(1 - q)^{1/2} g(t_n) \text{ i.o.}) = 1.$$

On the other hand, by Brownian symmetry, (1') with, say, $c = 2$,
yields a.s., for all $n$ sufficiently large,

$$-W_{t_{n+1}} < 2g(t_{n+1}) \text{ hence } W_{t_{n+1}} > -2\,\frac{g(t_{n+1})}{g(t_n)}\, g(t_n).$$

It follows that a.s.

$$W_{t_n} = Y_n + W_{t_{n+1}} \ge (c_1(1 - q)^{1/2} - 2g(t_{n+1})/g(t_n))\, g(t_n) \text{ i.o.},$$

where

$$g(t_{n+1})/g(t_n) \to q^{1/2},$$

hence, a.s.,

$$W_{t_n} \ge (c_1(1 - q)^{1/2} - 4q^{1/2})\, g(t_n) \text{ i.o.}$$

But

$$c_1(1 - q)^{1/2} - 4q^{1/2} \to c_1 > c \text{ as } q \to 0$$

hence we can select $q \in (0, 1)$ sufficiently small so that a.s.

$$W_{t_n} > cg(t_n) \text{ i.o.},$$

and (1'') is proved.

In preparation for the a.s. sample nowhere differentiability, we remind
the reader that a real-valued function $g$ on, say, $[0, \infty)$ is differentiable at
$t$ if and only if the four Dini derivatives at $t$ have common finite value.
The right ones are defined by

$$D^+g(t) = \limsup_{0 < h \to 0} \frac{g(t + h) - g(t)}{h}, \quad D_+g(t) = \liminf_{0 < h \to 0} \frac{g(t + h) - g(t)}{h}$$

and the left ones obtain upon replacing $h > 0$ by $h < 0$. If $g$ is continuous
then we can let $h \to 0$ along rationals.

Recall that a property holds a.s. if it holds outside some null event;
what happens on this “exceptional” event is irrelevant.

The ingenious proof below is due to Dvoretzky, Erdős and Kakutani.

C. Brownian sample nowhere differentiability. Almost all
Brownian sample functions $W_T(\omega) = (W_t(\omega), t \ge 0)$ are nowhere differentiable.

Proof. It suffices to show that the assertion holds on every finite interval
$J \subset T$; to simplify the notation we take $J = [0, 1]$.

Let

$$A = [D^+W_t \text{ and } D_+W_t \text{ are finite for some } t \in [0, 1]].$$

Then

$$A = \bigcup_{m=1}^{\infty} \bigcup_{n=1}^{\infty} A_{mn}$$

where

$$A_{mn} = [|W_{t+h} - W_t| < mh \text{ for all } h \in [0, 1/n] \text{ for some } t \in [0, 1]].$$

It suffices to prove that $A$ is contained in a null event or, equivalently,
that so is every $A_{mn}$.

On $A_{mn}$, if $t \in [(k - 1)/n, k/n]$ then, for $n \ge 4m$,

$$|W_{k/n} - W_t| < m/n, \quad |W_{(k+1)/n} - W_t| < 2m/n,$$

$$|W_{(k+2)/n} - W_t| < 3m/n, \quad |W_{(k+3)/n} - W_t| < 4m/n.$$

It follows, by triangle inequality, that

$$A_{mn} \subset B = \bigcap_{n \ge 4m} B_n \text{ with } B_n = \bigcup_{k=1}^{n} B_{nk},$$

where

$$B_{nk} = [|W_{(k+1)/n} - W_{k/n}| < 3m/n, \; |W_{(k+2)/n} - W_{(k+1)/n}| < 5m/n, \; |W_{(k+3)/n} - W_{(k+2)/n}| < 7m/n].$$

But the Brownian increments on the $B_{nk}$ are independent and normal
$\mathfrak{N}(0, 1/n)$. Therefore, for $n \ge 4m$,

$$PB_{nk} \le 3 \cdot 5 \cdot 7\, m^3/n^{3/2}$$

so that

$$PB \le PB_n \le nPB_{nk} \le 3 \cdot 5 \cdot 7\, m^3/n^{1/2} \to 0,$$

and the assertion is proved.
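
The final estimate of the proof is easy to check numerically. The sketch below computes the probability that three independent $\mathfrak{N}(0, 1/n)$ increments are smaller in absolute value than $3m/n$, $5m/n$ and $7m/n$, using $P(|Z| < x) = \operatorname{erf}(x/\sqrt{2})$ for a standard normal $Z$, and compares it with $3 \cdot 5 \cdot 7\, m^3/n^{3/2}$. The function names are ours.

```python
import math


def prob_small_increment(bound: float, variance: float) -> float:
    """P(|X| < bound) for X normal with mean 0 and the given variance."""
    return math.erf(bound / math.sqrt(2.0 * variance))


def prob_three_small_increments(m: int, n: int) -> float:
    """Probability of three consecutive small increments (independent factors)."""
    prob = 1.0
    for j in (3, 5, 7):
        prob *= prob_small_increment(j * m / n, 1.0 / n)
    return prob


def union_bound(m: int, n: int) -> float:
    """n times the common probability, an upper bound for the union over k."""
    return n * prob_three_small_increments(m, n)


if __name__ == "__main__":
    for n in (10 ** 2, 10 ** 4, 10 ** 6, 10 ** 8):
        print(n, union_bound(1, n), 105.0 / math.sqrt(n))
```

Three increments are needed because each contributes a factor of order $n^{-1/2}$, and the product must beat the factor $n$ coming from the union.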

Corollary. Almost all Brownian sample functions are not monotone
and not of bounded variation on nondegenerate bounded intervals $J$.

For continuous functions on $[0, \infty)$ are bounded on $J$, so that if these
functions were monotone they would be of bounded variation on $J$ and
such functions are $\lambda$-a.e. differentiable (where $\lambda$ is the Lebesgue measure),
contradicting sample nowhere differentiability.

Contrary to this Corollary, since $E(\Delta W_t)^2 = \Delta t$, we may expect that
the “quadratic variation” on an interval $J$, to be defined, would be its
length $|J|$: Let $J = [0, t]$ and, for $k = 1, \ldots, k_n$, let

$$W_{nk} = W_{t_{nk}} - W_{t_{n,k-1}}, \quad s_{nk} = t_{nk} - t_{n,k-1}$$

correspond to subdivisions

$$D_n = \{0 = t_{n0} < \cdots < t_{nk_n} = t\}$$

with

$$D_n \subset D_{n+1} \text{ and } \max_k s_{nk} \to 0.$$

We say that the quadratic variation of $W_{[0,t]}$ is

$$\int_0^t (dW_s)^2 = \text{a.s.} \lim \sum_k W_{nk}^2.$$

D. Brownian quadratic variation. If $W_{[0,t]}$ is Brownian then its
quadratic variation $\int_0^t (dW_s)^2 = t$, in fact,

$$V_n = \sum_k W_{nk}^2 \to t \text{ a.s. and in q.m.}$$

Proof. The $Y_{nk} = W_{nk}/\sqrt{s_{nk}}$ are iid in $k = 1, \ldots, k_n$ with common
law $\mathfrak{N}(0, 1)$, so that

$$E(Y_{nk}^2 - 1) = 0, \quad EY_{nk}^4 = 3$$

hence

$$E(Y_{nk}^2 - 1)^2 = 2.$$

Since the $Y_{nk}^2 - 1$ are independent in $k$ and $t = \sum_k s_{nk}$, it follows that

$$E(V_n - t)^2 = \sum_k s_{nk}^2 E(Y_{nk}^2 - 1)^2 = 2\sum_k s_{nk}^2 \le 2(\max_k s_{nk})\,t \to 0$$

hence

$$V_n \to t \text{ in q.m.}$$

To complete the proof, it suffices to show that the $V_n$ form a martingale
reversed sequence. For then, by 32A(ii), $V_n \to t$ in q.m. implies $V_n \to t$ a.s.
Without loss of generality, we can take $k_n = n$ by adding subdivisions
if necessary. Then, for every $n$ there is a $k \le n$ such that

(1) $V_n = V_{n+1} + 2W_{n+1,k}W_{n+1,k+1}.$

Let

$$\mathcal{C}_n = \mathfrak{B}(V_{n+1}, V_{n+2}, \ldots)$$

and


2)n = tyf = + I ^s — ^^ ,n + l ， Jk). 

Since c £> n conditions symmetrically the symmetrically distributed r_f. 

ty tn-\-\ y k = ^ 4* 1 4*l) > a.S -， 

$$E(W_{n+1,k}W_{n+1,k+1} \mid \mathcal{D}_n) = W_{n+1,k}\,E(W_{n+1,k+1} \mid \mathcal{D}_n) = 0$$

so that, conditioning (1) by $\mathcal{C}_n \subset \mathcal{D}_n$, a.s.

$$E(V_n \mid V_{n+1}, V_{n+2}, \ldots) = V_{n+1},$$

and the proof is completed.
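
The contrast between quadratic variation and ordinary variation can be observed on simulated increments. The sketch below draws independent $\mathfrak{N}(0, t/n)$ increments over a uniform subdivision of $[0, t]$ (a simplifying assumption; the theorem allows any nested subdivisions with mesh tending to 0) and returns the sum of squares and the sum of absolute values. The function name and the seed are illustrative.

```python
import math
import random


def variation_sums(t: float, n_steps: int, rng: random.Random):
    """Return (sum of squared increments, sum of absolute increments)
    for a simulated Brownian path on [0, t] with n_steps equal steps."""
    sd = math.sqrt(t / n_steps)
    quadratic = 0.0
    absolute = 0.0
    for _ in range(n_steps):
        increment = rng.gauss(0.0, sd)
        quadratic += increment * increment
        absolute += abs(increment)
    return quadratic, absolute


if __name__ == "__main__":
    rng = random.Random(0)
    for n_steps in (16, 256, 4096, 65536):
        quadratic, absolute = variation_sums(2.0, n_steps, rng)
        print(f"n={n_steps:6d}  sum of squares={quadratic:.4f}  sum of |.|={absolute:.2f}")
```

The sum of squares settles near $t$ with standard deviation $t\sqrt{2/n}$, while the sum of absolute values grows like $\sqrt{2nt/\pi}$, in line with the Corollary on unbounded variation.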

Note that the preceding Corollary to C follows also from D:

For, Brownian sample functions being uniformly continuous on $[0, t]$,
$\max_k |W_{nk}| \to 0$ so that, if $W_{[0,t]}$ were a.s. of bounded variation for
$t > 0$ then a.s.

$$0 < t \leftarrow \sum_k W_{nk}^2 \le \max_k |W_{nk}| \sum_k |W_{nk}| \to 0.$$


41.4. Brownian times and functionals. In what follows, $W_T = (W_t, t \ge 0)$
is a Brownian motion, $s$ and $t$ with or without affixes belong
to $T = [0, \infty)$, and

$$\mathfrak{B}_t = \mathfrak{B}(W_s, s \le t), \quad \mathfrak{B}_T = (\mathfrak{B}_t, t \ge 0); \quad \mathfrak{B}_\infty = \bigvee_t \mathfrak{B}_t.$$

A $\mathfrak{B}_T$-time (or $W_T$-time) will be called a “Brownian time” and we recall
the definition and various properties of such “times” in 38.4 that we
shall use without further comment.

A measurable function $\tau$ on $\Omega$ to $[0, \infty]$ is a Brownian time if $[\tau \le t] \in \mathfrak{B}_t$
for every $t$, and then so are all $\tau \wedge t$ and $\tau + t$. Every $t$ is a degenerate
Brownian time.

To Brownian times $\tau$ are associated $\sigma$-fields

$$\mathfrak{B}_\tau = \{B_\infty \in \mathfrak{B}_\infty : B_\infty[\tau \le t] \in \mathfrak{B}_t \text{ for all } t \ge 0\},$$

with respect to which they are measurable. If $\sigma \le \tau$ are Brownian then
$\mathfrak{B}_\sigma \subset \mathfrak{B}_\tau$. Since our $W_T$ is sample continuous, for every finite Brownian
$\tau$, the $W_\tau$ defined by $W_\tau(\omega) = W_{\tau(\omega)}(\omega)$ are $\mathfrak{B}_\tau$-measurable r.v.'s. Thus,
we write “$\mathfrak{B}_\tau = \mathfrak{B}(W_t, t \le \tau)$” and speak about events $B_\tau \in \mathfrak{B}_\tau$ as
“defined on $(W_t, t \le \tau)$” or “on $W_T$ up to time $\tau$” (included).

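Extremely useful Brownian times are finite elementary ones:

$$\tau = \sum_j t_j I_{B_j}, \quad t_1 < t_2 < \cdots, \quad B_j \in \mathfrak{B}_{t_j},$$

to which correspond r.v.'s

$$W_\tau = \sum_j W_{t_j} I_{B_j}.$$

For, given a finite Brownian time $\tau$, the elementary ones

$$\tau_n = \sum_{k=1}^{\infty} kh_n I_{[(k-1)h_n \le \tau < kh_n]}, \quad h_n = 1/2^n,$$

are such that

$$\tau_n \downarrow \tau, \quad \mathfrak{B}_{\tau_n} \supset \mathfrak{B}_\tau$$

and, by sample (right) continuity, the r.v.'s

$$W_{\tau_n} = \sum_{k=1}^{\infty} W_{kh_n} I_{[(k-1)h_n \le \tau < kh_n]} \to W_\tau.$$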
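
The dyadic approximation of a finite time from above is simple enough to write out. In the sketch below a single value of the time is approximated by $kh_n$ when it lies in $[(k-1)h_n, kh_n)$, with $h_n = 1/2^n$; the function name is ours, and exact fractions are used so that the ordering can be checked without rounding.

```python
import math
from fractions import Fraction


def dyadic_upper(tau, n: int) -> Fraction:
    """Value k / 2^n of the elementary time when (k-1)/2^n <= tau < k/2^n."""
    k = math.floor(tau * 2 ** n) + 1
    return Fraction(k, 2 ** n)


if __name__ == "__main__":
    tau = Fraction(1, 3)
    for n in range(1, 8):
        approx = dyadic_upper(tau, n)
        print(n, approx, float(approx - tau))
```

The approximations are strictly larger than the time, nonincreasing in $n$, and within $1/2^n$ of it, which is what the right continuity argument uses.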


Among the most important Brownian times are first exit times $\tau_U$
from open sets $U$ (or first hitting times of closed sets $U^c$): $\tau_U(\omega)$ is defined
to be the infimum of all those $t$ for which the distance of the sets $\{W_s(\omega) : s \le t\}$
and $U^c$ is zero; if such $\tau_U(\omega)$ does not exist, we set $\tau_U(\omega) = +\infty$.
The $\tau_U$ are Brownian times. For, setting $S_n = \{x : d(x, U^c) < 1/n\}$ and
taking all rationals $r < t$,

$$[\tau_U \le t] = [W_t \in U^c] \cup \Big(\bigcap_n \bigcup_{r < t} [W_r \in S_n]\Big) \in \mathfrak{B}_t.$$

We shall be using two such times and, to simplify the writing, set

$\tau_{a,b} = \tau_U$ when $U = (a, b)$ with $a < 0 < b$,

$\tau_c = \tau_U$ when $U$ is the complement of $\{c\}$.

Since Brownian motion starts at 0 and is sample continuous and unbounded
(of both signs), $\tau_{a,b}$ and $\tau_c$ are finite.

When $\tau = \tau_{a,b}$ we can describe completely $\mathcal{L}(W_\tau)$ and find $E\tau$ in a
rather elementary way. We require a very simple case of 39.2A that we
prove directly.

a. Martingale lemma. Let $(X_s, \mathfrak{B}_s, 0 \le s \le t)$ be a sample right continuous
martingale. If $Y_t = \sup_{0 \le s \le t} |X_s|$ is integrable then $EX_\sigma = EX_0$
for every time $\sigma$ $(\le t)$ of this martingale.

Proof. Let $h_n = 1/2^n$, $k = 1, \ldots, 2^n$, and set

$$B_{nk} = [(k - 1)h_n t < \sigma \le kh_n t]$$

so that

$$\sigma_n = \sum_k kh_n t\, I_{B_{nk}} \downarrow \sigma$$

and

$$X_{\sigma_n} = \sum_k X_{kh_n t}\, I_{B_{nk}} \to X_\sigma.$$

By martingale property, $kh_n t \le t$ implies that

$$EX_{\sigma_n} = \sum_k E(X_{kh_n t} I_{B_{nk}}) = \sum_k E(X_t I_{B_{nk}}) = EX_t = EX_0.$$

Since all

$$|X_{\sigma_n}| \le Y_t \text{ integrable},$$

the dominated convergence theorem applies so that

$$EX_0 = EX_{\sigma_n} \to EX_\sigma$$

and the lemma is proved.
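b. If $\tau$ is a Brownian time then, for every $t$,

(1) $EW_{\tau \wedge t} = 0$, (2) $EW_{\tau \wedge t}^2 = E(\tau \wedge t)$.

For, by the Brownian martingales property 41.2a, $(W_s, \mathfrak{B}_s, 0 \le s \le t)$
and $(W_s^2 - s, \mathfrak{B}_s, 0 \le s \le t)$ are martingales while 41.2d implies that
$\sup_{0 \le s \le t} |W_s|$ and $\sup_{0 \le s \le t} W_s^2$ are integrable, so that the above lemma with
$\sigma = \tau \wedge t$ applies, and $EW_{\tau \wedge t} = 0$,

$$E(W_{\tau \wedge t}^2 - \tau \wedge t) = E(W_0^2 - 0) = 0.$$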

A. If $\tau = \tau_{a,b}$ then $W_\tau$ takes either value $a$ or value $b$,

(1) $P(W_\tau = a) = \dfrac{b}{|a| + b}, \quad P(W_\tau = b) = \dfrac{|a|}{|a| + b}$

and

(2) $E\tau = EW_\tau^2 = |a|b.$

Proof. We use the above proposition without further comment.
Since Brownian sample functions start at $0 \in (a, b)$ and are unbounded
of both signs, $\tau$ is finite and $W_T(\omega)$ leaves $(a, b)$ either at $a$ or at $b$ with pr.
$p$ and $1 - p$, respectively. Therefore,

$$|W_t| \le c = |a| + b \text{ on } [\tau > t]$$

so that, as $t \to \infty$,

$$\int_{[\tau > t]} |W_t| \le cP(\tau > t) \to 0$$

and, by the dominated convergence theorem, letting $t \to \infty$ along integers,

$$0 = EW_{\tau \wedge t} = \int_{[\tau \le t]} W_\tau + \int_{[\tau > t]} W_t \to EW_\tau.$$

Thus, $ap + b(1 - p) = 0$ and (1) obtains.

The same argument applied to $W_t^2$ in lieu of $W_t$ yields

$$E\tau \leftarrow E(\tau \wedge t) = EW_{\tau \wedge t}^2 = \int_{[\tau \le t]} W_\tau^2 + \int_{[\tau > t]} W_t^2 \to EW_\tau^2,$$

and (2) obtains using (1). The proof is terminated.
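
The exit law from $(a, b)$ can be checked in two ways. The sketch below first verifies in exact arithmetic that the stated probabilities solve $ap + b(1 - p) = 0$ and give second moment $|a|b$. It then simulates a simple symmetric random walk with integer barriers, a discrete stand-in for Brownian motion (our choice for illustration, not the method of the text) for which the same exit probabilities and mean exit time are known to hold. All names and the seed are illustrative.

```python
import random
from fractions import Fraction


def exit_law(a: int, b: int):
    """Exit probabilities (at a, at b) from (a, b), for integers a < 0 < b."""
    return Fraction(b, b - a), Fraction(-a, b - a)


def simulate_walk_exit(a: int, b: int, rng: random.Random):
    """Run a simple symmetric random walk from 0 until it leaves (a, b).
    Returns (exit position, number of steps)."""
    position, steps = 0, 0
    while a < position < b:
        position += rng.choice((-1, 1))
        steps += 1
    return position, steps


if __name__ == "__main__":
    a, b = -2, 3
    p_a, p_b = exit_law(a, b)
    print("exit law:", p_a, p_b, " mean:", a * p_a + b * p_b,
          " second moment:", a * a * p_a + b * b * p_b)
    rng = random.Random(1)
    runs = [simulate_walk_exit(a, b, rng) for _ in range(20000)]
    print("simulated P(exit at b):", sum(pos == b for pos, _ in runs) / len(runs))
    print("simulated mean exit time:", sum(n for _, n in runs) / len(runs))
```

The simulated frequencies are only approximate; the exact check is the algebraic one.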

By proving first that $\tau_{a,b}$ is integrable, the preceding proposition would
result from the much more general

A'. Brownian Wald relations. If $\tau$ is an integrable Brownian time
then

$$EW_\tau = 0 \text{ and } EW_\tau^2 = E\tau < \infty.$$

Proof. According to b

(1) $EW_{\tau \wedge t} = 0$, (2) $EW_{\tau \wedge t}^2 = E(\tau \wedge t)$.

Since $E(\tau \wedge t) \le E\tau < \infty$, (2) yields (3) $\sup_t EW_{\tau \wedge t}^2 < \infty$.

Thus, letting $t$ take integer values, the $W_{\tau \wedge t}$ are uniformly integrable
hence we can interchange “$\lim_{t \to \infty}$” and “$E$” in (1) and, applying the monotone
convergence theorem, the first relation obtains by

$$0 = EW_{\tau \wedge t} \to EW_\tau = 0.$$

As for the second relation, since the $W_{\tau \wedge t}$ form a martingale, by (3)
and 39.1C,

$$E \sup_t W_{\tau \wedge t}^2 \le 4 \sup_t EW_{\tau \wedge t}^2 < \infty$$

so that, by (2), applying the dominated and monotone convergence
theorem, as $t \to \infty$ along integers,

$$EW_\tau^2 \leftarrow EW_{\tau \wedge t}^2 = E(\tau \wedge t) \uparrow E\tau < \infty,$$

and the second relation obtains.

Perhaps the most important general Brownian property is the one
proved below. It was used systematically by P. Lévy and rigorously
proved by Hunt and by Blumenthal. It will be applied without further
comment in the remainder of this subsection, especially with $\tau = \tau_c$.

B. Brownian times origin invariance. Brownian motion starts
anew at every finite Brownian time: If $\tau$ is a finite time of Brownian
$W_T = (W_t, t \ge 0)$ then the r.f.

$$W_T^\tau = W_{\tau+T} - W_\tau = (W_{\tau+t} - W_\tau, \; t \ge 0)$$

is Brownian and independent of $W_{[0,\tau]} = (W_t, t \le \tau)$.

Proof. Clearly, $W_T^\tau$, like $W_T$, starts at 0 and its sample functions are
continuous and unbounded of both signs. Thus, it remains to prove that
$W_T^\tau$ is Brownian distributed and independent of $W_{[0,\tau]}$.

Let $(W_{t_1}, \ldots, W_{t_m})$ and $(W_{t_1}^\tau, \ldots, W_{t_m}^\tau)$ be arbitrary finite sections
of $W_T$ and $W_T^\tau$. Let $g_1, \ldots, g_m$ be arbitrary, real- or complex-valued,
bounded continuous functions on $R$, and let an arbitrary event $B \in \mathfrak{B}_\tau = \mathfrak{B}(W_t, t \le \tau)$.
It suffices to prove that

(C) $E\{I_B\, g_1(W_{t_1}^\tau) \times \cdots \times g_m(W_{t_m}^\tau)\} = PB \cdot E\{g_1(W_{t_1}) \times \cdots \times g_m(W_{t_m})\}.$

For, say, take $g_j(a) = e^{iu_j a}$ so that $g_j(W_t) = e^{iu_j W_t}$, $g_j(W_t^\tau) = e^{iu_j W_t^\tau}$ and
recall that mutually consistent ch.f.'s of finite sections determine consistent
distributions of these sections which, in turn, determine the distribution
of the corresponding r.f.'s. Or, take $g_j(x) = I_{(-\infty, a_j)}(x)$ so that, say,
$g_j(W_t) = I_{[W_t < a_j]}$ and proceed from there.

Let

$$\tau_n = \sum_{k=1}^{\infty} kh_n I_{[(k-1)h_n \le \tau < kh_n]}, \quad h_n = 1/2^n,$$

so that

(1) $\tau_n \downarrow \tau$ and $W_{\tau_n + t} \to W_{\tau + t}$

and

(2) $B[\tau_n = kh_n] \in \mathfrak{B}_{kh_n} = \mathfrak{B}(W_s, s \le kh_n).$

As usual, we first prove (C$_n$), that is, (C) with $\tau_n$ in lieu of $\tau$. Then,
letting $n \to \infty$, (C) obtains by (1) and the dominated convergence theorem.
But, by (2) and Brownian decomposability and invariance under
change of origin to $kh_n$,

$$E\{I_{B[\tau_n = kh_n]}\, g_1(W_{kh_n + t_1} - W_{kh_n}) \times \cdots \times g_m(W_{kh_n + t_m} - W_{kh_n})\} = P(B[\tau_n = kh_n])\, E\{g_1(W_{t_1}) \times \cdots \times g_m(W_{t_m})\}$$

so that, summing over $k$, (C$_n$) obtains and the proof is completed.

This powerful proposition makes the properties of $W_T$ and of $W_T^\tau$ the
same. For example,


Corollary. Let $\tau$ be a finite Brownian time. Then

(i) Almost surely

$$\limsup_{t \to 0} (W_{\tau+t} - W_\tau)/\sqrt{2t \log\log 1/t} = 1$$

and

$$\liminf_{t \to 0} (W_{\tau+t} - W_\tau)/\sqrt{2t \log\log 1/t} = -1.$$

(i') Almost surely

$$\limsup_{t \to \infty} (W_{\tau+t} - W_\tau)/\sqrt{2t \log\log t} = 1$$

and

$$\liminf_{t \to \infty} (W_{\tau+t} - W_\tau)/\sqrt{2t \log\log t} = -1.$$

(ii) Almost all sample functions $W_T^\tau(\omega)$ have an infinity of values of $t$
in every neighborhood of $\tau(\omega)$ such that $W_{\tau+t}(\omega) = W_\tau(\omega)$.

(ii') Almost all sample functions $W_T^\tau(\omega)$ have an infinity of values of $t$ in
every neighborhood of $t = \infty$ such that $W_{\tau+t}(\omega) = W_\tau(\omega)$.

Note that, by sample continuity, (i) implies (ii) and (i') implies (ii').

C. Brownian level sets structure. For almost all $\omega$, the $c$-level
sets

$$A_c(\omega) = \{t : W_t(\omega) = c\}$$

are perfect, unbounded and $\lambda$-null.

By definition, a set is perfect if it is closed and dense in itself (hence
uncountable). Recall that $\lambda$ is Lebesgue measure and $\tau_c = \inf\{t : W_t = c\}$ is
a finite Brownian time.

Proof. It suffices to prove the assertion for the “zero” level sets $A_0(\omega)$
since, for $\tau = \tau_c$,

$$A_c(\omega) = \{\tau + t : W_t^\tau(\omega) = W_{\tau+t}(\omega) - c = 0\}.$$

The $A_0(\omega)$ are closed since $W_T(\omega)$ are continuous, and they are unbounded
by Corollary (ii'). For almost all $\omega$, $\lambda(A_0(\omega)) = 0$ since $W_T$
is Borelian hence $(\lambda \times P)$-measurable and $P(W_t = 0) = 0$ for every $t > 0$, so
that, by Fubini theorem,

$$E(\lambda(A_0)) = E \int_0^\infty I_{[W_t = 0]}\,dt = \int_0^\infty P(W_t = 0)\,dt = 0.$$

It remains to prove that almost all $A_0(\omega)$ are dense in themselves. For
every rational $r \ge 0$, let

$$\tau(r) = \inf\{t \ge r : W_t = 0\}.$$

Clearly, $\tau(r)$ is a finite Brownian time and $W_{\tau(r)} = 0$. By Corollary
(ii), $W(0) = 0$ is a.s. right limit of zeros of $W_T$, that is, 0 is limit point of
$A_0(\omega)$ for almost all $\omega$. The same is true with $\tau(r)$ for Brownian $W_T^{\tau(r)}$,
since $W_{\tau(r)} = 0$. Thus, if $A(r)$ is the set of those $\omega$ for which $\tau(r)$ is limit point of
$A_0(\omega)$ then $P(\bigcap_r A(r)) = 1$. But $\omega \in \bigcap_r A(r)$ if and only if $A_0(\omega)$ is
dense in itself, and the proof is completed.

The following application of B is an extremely useful and far-reaching
generalization of the “symmetry” or “reflection principle” of Désiré
André for the ballot problem studied in Intuitive Background CDIII.

D. Reflection principle. Brownian motion reflected at a finite
Brownian time $\tau$ is Brownian: If $W_T$ is Brownian so is $\rho_\tau W_T$ defined by

$$\rho_\tau W_t = W_t \text{ or } 2W_\tau - W_t \text{ according as } t \le \tau \text{ or } t > \tau.$$

Proof. Clearly, $\rho_\tau W_T$ starts at 0 and is sample continuous and unbounded
of both signs. It remains to prove that it is Brownian distributed.

Let $Y_t = W_t$ for $t \le \tau$ and $Y_t = W_\tau$ for $t > \tau$ be $W_T$ “stopped at $\tau$”
and, as before, let $W_T^\tau = W_{\tau+T} - W_\tau$ be $W_T$ “starting anew at $\tau$.”
Since $\tau$ and $Y_T$ are $\mathfrak{B}_\tau$-measurable, the Brownian $W_T^\tau$ is independent of
$(\tau, Y_T)$ and, by symmetry, so is $-W_T^\tau$. Therefore, setting

$$U_t = V_t = W_t \text{ for } t \le \tau$$

and

$$U_t = W_\tau + W_{t-\tau}^\tau, \quad V_t = W_\tau - W_{t-\tau}^\tau \text{ for } t \ge \tau,$$

$U_T$ and $V_T$ have the same distribution. But, for $t \ge \tau$,

$$W_{t-\tau}^\tau = W_t - W_\tau$$

so that

$$U_t = W_\tau + (W_t - W_\tau) = W_t$$

and

$$V_t = W_\tau - (W_t - W_\tau) = 2W_\tau - W_t = \rho_\tau W_t.$$

Thus,

$$U_T = W_T, \quad V_T = \rho_\tau W_T,$$

and the asserted principle obtains.

In the remainder of this subsection we derive distributions of various
Brownian functionals using without further comment B and D with $\tau = \tau_c$,
the $c$-level reflection: The changes of $W_T$ after $c$-level time $\tau_c$ are independent
of its changes before $\tau_c$ and their pr.'s are not changed by
reflecting $W_T$ at $c$-level.

First we transform extrema inequalities 41.2d into

c. Extrema equalities. For $c > 0$

$$P(M_t > c) = P(m_t < -c) = P(|W_t| > c) = \sqrt{2/\pi} \int_{c/\sqrt{t}}^{\infty} e^{-v^2/2}\,dv.$$

For,

$$P(M_t > c) = P(M_t > c, W_t > c) + P(M_t > c, W_t = c) + P(M_t > c, W_t < c)$$

where

$$P(M_t > c, W_t = c) \le P(W_t = c) = 0,$$

$$P(M_t > c, W_t > c) = P(W_t > c)$$

and, by $c$-level reflection,

$$\rho_{\tau_c} W_t = W_t \text{ or } 2c - W_t \text{ according as } t \le \tau_c \text{ or } t > \tau_c$$

hence, setting $\rho_{\tau_c} M_t = \max_{0 \le s \le t} \rho_{\tau_c} W_s$,

$$P(M_t > c, W_t < c) = P(\rho_{\tau_c} M_t > c, \rho_{\tau_c} W_t < c) = P(M_t > c, 2c - W_t < c) = P(M_t > c, W_t > c) = P(W_t > c).$$
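
The counting argument behind the reflection at level $c$ can be verified exhaustively in the discrete setting of the ballot problem. The sketch below enumerates all paths of a simple symmetric random walk of length $n$ (a discrete analogue chosen for illustration, not the Brownian statement itself) and counts those whose maximum reaches $c$, those ending above $c$, and those ending at $c$. Reflection pairs each path that reaches $c$ and ends below it with one ending above it, so the first count equals twice the second plus the third. The function name is ours.

```python
from itertools import product


def reflection_counts(n: int, c: int):
    """Enumerate all 2^n paths with steps +1/-1 started at 0.
    Returns (paths with maximum >= c, paths ending > c, paths ending at c)."""
    reached = above = at_level = 0
    for steps in product((-1, 1), repeat=n):
        position = 0
        peak = 0
        for step in steps:
            position += step
            peak = max(peak, position)
        reached += peak >= c
        above += position > c
        at_level += position == c
    return reached, above, at_level


if __name__ == "__main__":
    for c in (1, 2, 3, 4):
        reached, above, at_level = reflection_counts(10, c)
        print(c, reached, 2 * above + at_level)
```

In the Brownian case the term corresponding to paths ending exactly at $c$ has pr. 0, which leaves the equality $P(M_t > c) = 2P(W_t > c)$.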


More generally 


E. Distribution of (M h fV t ). Let c > 0 and J = [ri, c 2 ]. Then 


1 广 v c 

2irt ^civc 


•u 2 /2< 


dv 


V2tt/ 



(2c — ci)v« 


dv. 


(2c-c 2 ) Vc 


P(M t >c,W t CJ)=^= t 

Proof. Since P(JV t = 0), 

P(Mt > C, IV t cj) = P(fVt G J[c y ① ））+ P(M t > r, % e (- 00〆])• 

Let pi and p 2 be the first and the second right side terms, respectively. 
Thus, 


Pi 


Vliri 


^C2V C 
J c\yc 


e_ v2l2t dv 


while, using P(M t = r) = 0, 


p 2 = P(p c M t > c y p Wt C. J ^ (— 00 〆]) 

= P(M t ^ c) y W t G [2c - r 2 , 2c - n] [c : 

/ (2c—ci)yc 

e _v”2t 

- W ^2c-C2)Vc 


)) 


where the last but one equality obtains by 

w t e [2c - r 2 , 2c - n] n k, «) C [Mt ^ c]. 

Note that E contains c : Take / = (— ^+①）. Also, taking c = x 
and =— ① 〆 2 = j ， E becomes 

P(M t > x y W t <y) ^ 2?{W t >2x -y) 

and yields 

Corollary 1. The joint distribution of (Mt y W t ) has pr. density 

p(^>y) = V2/V/ 2>V 厂之 exp{ — (2x — y) 2 /2t\ 

for x > 0 andy ^ x and is 0 otherwise. 

Upon setting $y = x - z$ so that $z$ is a value of $M_t - W_t$, Corollary 1
yields

Corollary 2. The joint distribution of $(M_t, M_t - W_t)$ has pr. density

$$p_1(x, z) = \sqrt{2/\pi t^3}\,(x + z) \exp\{-(x + z)^2/2t\}$$

for $x > 0$, $z > 0$ and is 0 otherwise.

Since $p_1(x, z)$ is symmetric in $x$ and $z$, upon taking c into account, we
obtain

Corollary 3. The r.v.'s $M_t$, $-m_t$, $|W_t|$, $M_t - W_t$, $W_t - m_t$ have the
same d.f.

$$F(x) = \sqrt{2/\pi t} \int_0^x e^{-v^2/2t}\, dv$$

for $x > 0$ and 0 otherwise.

Yet, there is a glaring contrast between, on the one hand, $M_t$ and
$-m_t$ which are nondecreasing, positive on $(0, \infty)$ and vanish only at
$t = 0$ and, on the other hand, the identically distributed $|W_t|$, $M_t - W_t$
and $W_t - m_t$ which oscillate a.s. as $t$ varies and have a.s. an infinity of
zeros.
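
The reflection argument behind the extrema equalities can be checked exactly on the coin-tossing walk, where the analogue of $P(M_t > c) = 2P(W_t > c)$ is $P(\max_{k \le n} S_k \ge c) = P(S_n \ge c) + P(S_n > c)$ for integers $c > 0$. The sketch below is only an illustration of that discrete analogue, not part of the text's derivation; the function names are hypothetical, and the extra term $P(S_n = c)$ appears because the walk, unlike $W_t$, has atoms.

```python
from fractions import Fraction
from math import comb


def walk_pmf(n, k):
    """P(S_n = k) for the coin-tossing walk (steps +1/-1 with pr. 1/2)."""
    if abs(k) > n or (n + k) % 2:
        return Fraction(0)
    return Fraction(comb(n, (n + k) // 2), 2 ** n)


def prob_max_at_least(n, c):
    """P(max_{k<=n} S_k >= c) for c > 0, by absorbing the walk at level c."""
    # alive[s] = pr. that the walk is at s and has not yet reached level c
    alive = {0: Fraction(1)}
    for _ in range(n):
        nxt = {}
        for s, p in alive.items():
            for step in (-1, 1):
                if s + step < c:  # paths that reach c are dropped (absorbed)
                    nxt[s + step] = nxt.get(s + step, Fraction(0)) + p / 2
        alive = nxt
    return 1 - sum(alive.values(), Fraction(0))


def reflection_value(n, c):
    """P(S_n >= c) + P(S_n > c), the value predicted by reflection at level c."""
    at_least = sum((walk_pmf(n, k) for k in range(c, n + 1)), Fraction(0))
    above = sum((walk_pmf(n, k) for k in range(c + 1, n + 1)), Fraction(0))
    return at_least + above
```

Exact rational arithmetic is used so that the two sides can be compared with `==` rather than up to rounding.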


§ 42. LIMIT DISTRIBUTIONS 

The far-reaching extension of limit laws of sums of random variables 
and of random vectors to convergence of laws of random functions and of 
their functionals began with Kolmogorov (1931). He was followed by 
Petrovsky and by Khintchine (1936). Their main concern was with 
events related to sequences of sums of independent or Markov dependent 
r.v.'s remaining within given boundaries determined by continuously
differentiable functions, and with the relevant heat equation and more
general equations of parabolic type.

Some twenty years later, Erdős and Kac reopened the investigation
of functional limit distributions in a new and direct probabilistic way.
They brought to bear their “invariance principle” idea: To find limit
distributions for functionals of random walks, say, $\xi(0,1)$ with common
expectation 0 and variance 1, do it first for simple random walks, say,
the coin-tossing one; the limit distributions so found then remain valid
or “invariant” for all $\xi(0,1)$.

In 1952, Donsker obtained his crucial invariance principle for $\xi(0,1)$.
An unpublished extension of Donsker's result is given by L. Le Cam in a




note at the end of this chapter. His argument is direct and extends to
(separable) Banach-valued r.v.'s. Within a few years, a flurry of
contributions, especially those of Prohorov and Skorohod, state and solve
the problem in full generality in terms of convergence of laws of r.f.'s
and of their functionals. Prohorov (1956) builds the foundations (see
§ 12) and, in particular, solves the problem for r.f.'s which are sample
continuous; Donsker's principle becomes a special case. Simultaneously,
Skorohod investigates the case of r.f.'s whose sample functions are
continuous except for jumps.

Five years later, Skorohod created a new approach; in particular, for
Brownian convergence he introduced his “Brownian embedding.” Soon
thereafter, Strassen used it in an unexpected direction, for a.s.
convergence, and obtained new laws of the iterated logarithm.

Besides the relevant works by the above-mentioned authors, the
interested reader is referred to books rich in ideas and results, by
Billingsley, by Gikhman and Skorohod and, for the Skorohod approach, by
Breiman and by Freedman.

42.1. Pr.'s on $\mathcal{C}$. Let $\mathcal{C} = \mathcal{C}[0, 1]$ be the family of real-valued
continuous functions on $[0, 1]$. The $t$-th coordinate of its “point” $x$ is the
value $x(t)$ of the function $x$ at $t$, and $\mathcal{C}$ is linear:

$$x, y \in \mathcal{C} \Longrightarrow ax + by \in \mathcal{C}.$$

Since $[0, 1]$ is compact, these functions are bounded and uniformly
continuous. Since uniform limits of continuous functions are continuous, it
follows that, under the uniform norm

$$\|x\| = \sup_t |x(t)| = \max_t |x(t)|,$$

the space $\mathcal{C}$ with the corresponding metric $d(x, y) = \|x - y\|$ is
complete. Thus, $\mathcal{C}$ becomes a Banach space.

Furthermore, this space is separable. For, its members $x$ can be
approximated uniformly by polygonal functions $x_n$ linear between $(k - 1)/n$ and
$k/n$ with values $x(k/n)$ at the vertices $(k = 1, \ldots, n)$ and these ones
can be approximated uniformly by polygonal ones with rational values
at the vertices, which form a countable subset of $\mathcal{C}$.

The continuity modulus or $\delta$-oscillation $\gamma_x(\delta)$ of $x \in \mathcal{C}$ is defined by

$$\gamma_x(\delta) = \sup_{|s - t| < \delta} |x(s) - x(t)|, \quad 0 < \delta < 1.$$

a. $\gamma_x(\delta)$ is uniformly continuous in $x$ for every fixed $\delta$.

For, triangle inequality yields

$$\gamma_x(\delta) \le 2\|x - y\| + \gamma_y(\delta)$$

so that upon interchanging $x$ and $y$,

$$|\gamma_x(\delta) - \gamma_y(\delta)| \le 2\|x - y\|.$$

b. $\gamma_x(\delta)$ is nondecreasing in $\delta$, and $\gamma_x(\delta) \to 0$ as $\delta \to 0$, uniformly in
$x \in K$ compact.

For clearly, $\gamma_x(\delta)$ is nondecreasing in $\delta$, $\gamma_x(\delta) \to 0$ as $\delta \to 0$, and the
Dini lemma for uniform convergence on compact $K \subset \mathcal{C}$ applies.
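
Properties a and b can be tried out numerically on functions sampled on a uniform grid of $[0, 1]$. The sketch below is only an illustration under that assumption: `oscillation` is a grid surrogate for $\gamma_x(\delta)$, not the supremum over all of $[0, 1]$, and both function names are hypothetical. The same triangle-inequality argument shows that the grid version also satisfies $|\gamma_x(\delta) - \gamma_y(\delta)| \le 2\|x - y\|$ and is nondecreasing in $\delta$.

```python
def sup_norm(x, y):
    """Uniform distance between two functions sampled on the same grid."""
    return max(abs(a - b) for a, b in zip(x, y))


def oscillation(x, delta):
    """Grid version of the delta-oscillation of x.

    x holds the values x(k/n), k = 0..n; the result is the largest
    |x(s) - x(t)| over grid points s, t with |s - t| < delta.
    """
    n = len(x) - 1
    best = 0.0
    for i in range(n + 1):
        for j in range(i + 1, n + 1):
            if (j - i) / n >= delta:   # pairs at distance >= delta are excluded
                break
            best = max(best, abs(x[j] - x[i]))
    return best
```

The double loop is quadratic in the grid size, which is acceptable for a check of this kind; a sliding-window maximum would be the choice for long samples.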

Since the Arzelà-Ascoli theorem is crucial for tightness criteria of pr.'s
on $\mathcal{C}$, we give its proof, using 2.3 and especially b and B therein without
further comment.

A. Arzelà-Ascoli theorem. $A \subset \mathcal{C}$ has compact closure $\bar{A}$ if and only
if

(i) $A$ is uniformly equicontinuous: $\sup_{x \in A} \gamma_x(\delta) \to 0$ as $\delta \to 0$

and

(ii) $A$ is uniformly bounded: $\sup_{x \in A} \sup_t |x(t)| < \infty$

or

(ii') $A$ is uniformly bounded at 0: $\sup_{x \in A} |x(0)| < \infty$.

Proof. Clearly, (ii) implies (ii') and, under (i), (ii') implies (ii) since,
by taking $n$ sufficiently large so that $\sup_{x \in A} \gamma_x(1/n) < \infty$,

$$|x(t)| \le |x(0)| + \sum_{k=1}^{n} |x(kt/n) - x((k - 1)t/n)|$$

and (ii) follows.

If $\bar{A}$ is compact then it is totally bounded hence bounded and, a
fortiori, so is $A$, that is, (ii) holds and (i) follows by b.

Conversely, let (i) and (ii) hold. If $x_n \in A$ then, by (ii) and use of the
diagonal process, we obtain a subsequence $(x_{n'})$ which, on the set of all
rationals $r \in [0, 1]$, converges to some $x$ with $x \in \bar{A}$. But, by (i), for
every $\varepsilon > 0$ there is a $\delta = \delta(\varepsilon) > 0$ such that $|s - t| < \delta$ implies
$|x_n(s) - x_n(t)| < \varepsilon$, $n = 1, 2, \ldots$. Thus, for every $t$ there is a rational
$r = r(\varepsilon)$ such that

$$|x_{n'}(t) - x_{m'}(t)| \le |x_{n'}(t) - x_{n'}(r)| + |x_{n'}(r) - x_{m'}(r)| + |x_{m'}(r) - x_{m'}(t)| \le 2\varepsilon + |x_{n'}(r) - x_{m'}(r)|$$

and, letting $m', n' \to \infty$ then $\varepsilon \to 0$, $\|x_{n'} - x_{m'}\| \to 0$ hence $x_{n'} \to x \in \bar{A}$.

If $y_n \in \bar{A}$ then there are $x_n \in A$ with $d(x_n, y_n) < 1/n$ and $x_{n'} \to x \in \bar{A}$
so that $y_{n'} \to x \in \bar{A}$ hence $\bar{A}$ is compact.

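Projections of $\mathcal{C}$ are defined by

$$\mathrm{pr}_{t_1, \ldots, t_m}(x) = (x(t_1), \ldots, x(t_m)), \quad x \in \mathcal{C},$$

and are continuous mappings of $\mathcal{C}$ onto $R_{t_1} \times \cdots \times R_{t_m}$;

Borel cylinders in $\mathcal{C}$ are defined by $\mathrm{pr}^{-1}_{t_1, \ldots, t_m}(A)$ with base $A$ a
Borel set in $R_{t_1} \times \cdots \times R_{t_m}$.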

c. The $\sigma$-field $\mathcal{S}$ of Borel sets in $\mathcal{C}$, generated by the class of open (closed)
sets in $\mathcal{C}$, is generated by the class of Borel cylinders in $\mathcal{C}$, that is, $\mathcal{S}$ is the
smallest $\sigma$-field for which all projections are measurable.

For, $\mathcal{C}$ being separable, the open sets in it are countable unions of open
spheres, and

$$\{x: \|x - x_0\| < r\} = \bigcup_{n=1}^{\infty} \bigcap_{q} \{x: |x(q) - x_0(q)| < r - 1/n\}$$

where $q$ varies over all rationals in $[0, 1]$.

In what follows, $P$ with or without affixes denotes pr.'s on $\mathcal{S}$ and
the finite sections of $P$ are restrictions $P\,\mathrm{pr}^{-1}_{t_1, \ldots, t_m}$ of $P$ to Borel
cylinders with bases in $R_{t_1} \times \cdots \times R_{t_m}$; they are pr.'s on the class of
such cylinders. We use constantly the concepts and results in Section
12.

d. (i) Weak convergence of sequences of pr.'s on $\mathcal{S}$ entails weak
convergence of their finite sections: For every finite subset $(t_1, \ldots, t_m)$ of
$[0, 1]$,

$$P_n \xrightarrow{w} P_0 \Longrightarrow P_n\,\mathrm{pr}^{-1}_{t_1, \ldots, t_m} \xrightarrow{w} P_0\,\mathrm{pr}^{-1}_{t_1, \ldots, t_m}.$$

(ii) If finite sections of pr.'s $P_n$ converge weakly then they converge
weakly to finite sections of some pr. $P_0$.

For, (i) is immediate and (ii) obtains by the Consistency theorem.

While weak convergence of finite sections of the $P_n$ determines $P_0$, it
does not imply weak convergence of the $P_n$ to $P_0$. For this we require
moreover tightness of the $P_n$.

We say that $h$ is $P_0$-admissible on $\mathcal{C}$ if $h$ is a $P_0$-a.e. continuous Borel
function on $(\mathcal{C}, \mathcal{S})$ to a metric space with its $\sigma$-field of Borel sets.

B. Weak convergence criterion. $P_n \xrightarrow{w} P_0$ if and only if

(i) the finite sections of the $P_n$ converge weakly to those of $P_0$ and the
sequence $P_n$ is tight or, equivalently, relatively compact

or

(ii) $P_n h^{-1} \xrightarrow{w} P_0 h^{-1}$ for every $P_0$-admissible $h$ on $\mathcal{C}$.

Proof. We prove (i): The equivalence assertion results from 12.3A
since $\mathcal{C}$ is complete and separable. The “only if” assertion results from
$P_n \xrightarrow{w} P_0$ implying relative compactness of the sequence $P_n$ and from
d(i). The “if” assertion obtains as follows: Relative compactness
means that every subsequence of the sequence $P_n$ contains a weakly
convergent subsequence. But by finite sections hypothesis, d(ii) applies
and the weak pr. limit $P_0$ is the same for all such subsequences hence
$P_n \xrightarrow{w} P_0$.

As for assertion (ii), the “only if” part obtains by 12.1A Corollary
and the “if” part obtains by taking $h$ to be the identity function on $\mathcal{C}$:
$h(x) = x$.

For $\mathcal{C}$, we have the most useful

C. Tightness criterion. The sequence $(P_n)$ is tight if and only if

(i) $\sup_n P_n(x: |x(0)| > c) \to 0$ as $c \to \infty$

and, for every $\varepsilon > 0$, as $\delta \to 0$,

(ii) $\limsup_n P_n(x: \gamma_x(\delta) > \varepsilon) \to 0$

or

(ii') $\sup_n P_n(x: \gamma_x(\delta) > \varepsilon) \to 0$.

Note that (i) says that the sequence $(P_n\,\mathrm{pr}_0^{-1})$ is tight.

Proof. Since finite families of pr.'s on $\mathcal{C}$ are tight, (ii) and (ii') are
equivalent.

Let $(P_n)$ be tight. Then, given $\varepsilon > 0$ and $\eta > 0$ there is a compact
$K \subset \mathcal{C}$ with $P_n K^c < \eta$ for all $n$, so that for $c$ sufficiently large

$$\{x: |x(0)| > c\} \subset K^c,$$

for $\delta$ sufficiently small,

$$\{x: \gamma_x(\delta) > \varepsilon\} \subset K^c,$$

and (i) and (ii) follow.

Conversely, let (i) and (ii) hold and set

$$A = \{x: |x(0)| > c\}, \quad B_m = \{x: \gamma_x(\delta_m) > 1/m\}.$$

Then, given $\varepsilon > 0$ and $\eta > 0$, we can choose $c$ and $\delta_m$ so that, for all $n$,

$$P_n A < \eta/2, \quad P_n B_m < \eta/2^{m+1}.$$

Thus, $P_n K^c < \eta$ for all $n$, where the closure $K$ of $A^c \cap \bigcap_m B_m^c$ is compact
by A, and the sequence $(P_n)$ is tight.

42.2. Limit distributions on $\mathcal{C}$. Let $X = (X(t), 0 \le t \le 1)$ be a sample
continuous r.f. so that $X$ is a measurable function on some pr. space
$(\Omega, \mathcal{A}, P)$ to $(\mathcal{C}, \mathcal{S})$. Finite sections of $X$ are of the form $(X(t_1), \ldots,
X(t_m))$. The law $\mathcal{L}(X)$ is given by the distribution $P_X$ on $\mathcal{S}$ defined
by

$$P_X(B) = P(X \in B), \quad B \in \mathcal{S}.$$

Let $X_n = (X_n(t), 0 \le t \le 1)$ be also sample continuous r.f.'s.
Convergence of laws $\mathcal{L}(X_n)$ to $\mathcal{L}(X)$ means that $P_{X_n} \xrightarrow{w} P_X$. We are concerned
primarily with limit laws, that is, limit distributions of functionals
$h(X_n)$. By 42.1B(ii), this problem reduces to

A. Functional limit laws criterion. $\mathcal{L}(X_n) \to \mathcal{L}(X)$ if and only if
$\mathcal{L}(h(X_n)) \to \mathcal{L}(h(X))$ for every $P_X$-admissible $h$.

By 42.1B(i) and c, the above criterion becomes

B. Random functions laws convergence criterion. $\mathcal{L}(X_n) \to \mathcal{L}(X)$
if and only if

(i) the laws of finite sections of the r.f.'s $X_n$ converge to those of the r.f.
$X$: For every finite subset $(t_1, \ldots, t_m) \subset [0, 1]$,

$$\mathcal{L}(X_n(t_1), \ldots, X_n(t_m)) \to \mathcal{L}(X(t_1), \ldots, X(t_m))$$

and

(ii) the laws of the sequence $(X_n)$ of r.f.'s are tight:

(ii$_1$) $\sup_n P(|X_n(0)| > c) \to 0$ as $c \to \infty$

and, for every $\varepsilon > 0$,

(ii$_2$) $\limsup_n P(\sup_{|s - t| < \delta} |X_n(s) - X_n(t)| > \varepsilon) \to 0$ as $\delta \to 0$.

From now on, $W$ denotes Brownian motion on $[0, 1]$ and we call Brownian
convergence convergence of laws of r.f.'s to $\mathcal{L}(W)$; $s$ and $t$ with or
without affixes belong to $[0, 1]$.


Let $\xi_{nk}$, $k = 1, \ldots, k_n \to \infty$, be independent r.v.'s with d.f.'s $F_{nk}$,
zero expectations, and finite variances $\sigma_{nk}^2 = \sigma^2(\xi_{nk}) = E\xi_{nk}^2$ such
that

$$\sum_k \sigma_{nk}^2 = 1, \quad \max_k \sigma_{nk}^2 \to 0.$$

Set

$$t_{nk} = \sum_{j=1}^{k} \sigma_{nj}^2, \quad S_{nk} = \sum_{j=1}^{k} \xi_{nj}, \quad t_{n0} = 0, \quad S_{n0} = 0.$$

Thus,

$$0 = t_{n0} \le \cdots \le t_{nk_n} = 1,$$

and we define the corresponding sample continuous r.f. $X_n = (X_n(t),
0 \le t \le 1)$ to be linear between $t_{nk}$ and $t_{n,k+1}$ with vertices $(t_{nk}, S_{nk})$
for $k = 0, \ldots, k_n$, that is,

$$X_n(t) = S_{nk} + \frac{t - t_{nk}}{t_{n,k+1} - t_{nk}}\, \xi_{n,k+1}, \quad t \in [t_{nk}, t_{n,k+1}).$$

We use the normal convergence criterion in 22.2 without comment to
obtain its very far-reaching generalization:

C. Brownian convergence criterion. Let r.v.'s $\xi_{nk}$ and r.f.'s $X_n$ be
as above and let $W$ be Brownian on $[0, 1]$.

Then

$\mathcal{L}(X_n) \to \mathcal{L}(W)$, in fact $\mathcal{L}(h(X_n)) \to \mathcal{L}(h(W))$
for every $P_W$-admissible $h$ on $\mathcal{C}$

if and only if Lindeberg condition holds: for every $\varepsilon > 0$,

$$g_n(\varepsilon) = \sum_{k=1}^{k_n} \int_{|x| > \varepsilon} x^2\, dF_{nk}(x) \to 0.$$

Note that Lindeberg condition implies $\max_k \sigma_{nk}^2 \to 0$.

Proof. For the “only if” assertion, take $h$ on $\mathcal{C}$ to $R$ to be defined by
$h(x) = x(1)$ so that $h$ is continuous hence admissible, $h(X_n) = S_{nk_n}$
and $h(W) = W(1)$ hence $\mathcal{L}(S_{nk_n}) \to \mathcal{N}(0, 1)$ and Lindeberg condition
holds.

It remains to prove the “if” assertion. By A, it suffices to show that
$\mathcal{L}(X_n) \to \mathcal{L}(W)$. Only proof of B(i) and B(ii$_2$) is required since $X_n(0) = 0 =
W(0)$ hence B(ii$_1$) holds.

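1°. We prove B(i). Let

$$X'_n(t) = \sum_{t_{nk} \le t} \xi_{nk}$$

so that, for every $t$,

$$|X_n(t) - X'_n(t)| \le \max_k |\xi_{nk}|$$

hence, for every $\varepsilon > 0$,

$$P(|X_n(t) - X'_n(t)| > \varepsilon) \le \sum_k P(|\xi_{nk}| > \varepsilon) \le \frac{1}{\varepsilon^2} \sum_k \int_{|x| > \varepsilon} x^2\, dF_{nk}(x) = g_n(\varepsilon)/\varepsilon^2 \to 0.$$

Thus, it suffices to prove B(i) for the $X'_n$ in lieu of the $X_n$. But, the
r.f.'s $X'_n$ and $W$ being decomposable and starting at 0, we have only to
show that for all $s < t$,

$$\mathcal{L}(X'_n(t) - X'_n(s)) \to \mathcal{L}(W(t) - W(s)).$$

Since the Lindeberg condition holds for $\sum_k \xi_{nk}$ it holds a fortiori for

$$X'_n(t) - X'_n(s) = \sum_{s < t_{nk} \le t} \xi_{nk}$$

while

$$\Big| \sum_{s < t_{nj} \le t} \sigma_{nj}^2 - (t - s) \Big| \le \max_k \sigma_{nk}^2 \to 0.$$

Thus

$$\mathcal{L}(X'_n(t) - X'_n(s)) \to \mathcal{N}(0, t - s) = \mathcal{L}(W_t - W_s)$$

and B(i) is proved.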

2°. We prove B(ii$_2$). For $|s - t| < \delta$, $s$ and $t$ belong to some interval
of the form $[k\delta, (k + 1)\delta]$ or to two adjacent such intervals. By triangle
inequality, it follows that

$$\gamma_{X_n}(\delta) = \sup_{|s - t| < \delta} |X_n(t) - X_n(s)| \le 2 \sup_k \sup_{k\delta \le t \le (k+2)\delta} |X_n(t) - X_n(k\delta)|$$
^ 8 . sup E

<>n.& 十 1 3 = i n Jt 

where 

jnk = maxjy: tnj ^ k8\. 

Since ， fovj nk < i < j n>k ^ ly 

limsup sup P ( L ^ > e/16) ^ 


( 16 / 我 


by the second part of 41.2c with ^ ^ = c/8, it follows that for 5 suf¬ 

ficiently small 

limsup P(y x (<t) > e) 


-~ r 6/€) 2 5 lim n SUP P \ X ^n^) - Xh 


But, by 1°, the right side "limsup” is 




VTZ 



e^ v2n dv 


and, by normal approximation 41.2b ， p(e y 8 )/d — 0 as 3 —► ()• Therefore ， 
as 5 — 0 ， 

limsup P(7x n (6) > €) = 0( E p{t, S)) = 0( 池 5)/5) — 0 

« k5 <t 

and B(ii) holds. 

Upon setting $\xi_{nk} = \xi_k/\sqrt{n}$, C yields

D. Donsker invariance principle on $\mathcal{C}$. Let $\xi = (\xi_1, \xi_2, \ldots)$ be a
centered second-order random walk with common expectation 0 and variance 1.

Let $X_n$ be polygonal r.f.'s with vertices $(k/n, S_k/\sqrt{n})$ where $S_k = \xi_1 +
\cdots + \xi_k$, $S_0 = 0$ and $k = 0, 1, \ldots, n$.

Then

$\mathcal{L}(X_n) \to \mathcal{L}(W)$, in fact $\mathcal{L}(h(X_n)) \to \mathcal{L}(h(W))$ for every $P_W$-admissible $h$.
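
The polygonal r.f. $X_n$ of D is easy to realize for a given finite run of steps. The following sketch, with the hypothetical helper name `donsker_polygon`, builds the function with vertices $(k/n, S_k/\sqrt{n})$ and linear interpolation in between; it only illustrates the construction and assumes at least one step.

```python
import math


def donsker_polygon(steps):
    """Return X_n as a function on [0, 1] for the steps xi_1, ..., xi_n (n >= 1)."""
    n = len(steps)
    sums = [0.0]                      # sums[k] = S_k, with S_0 = 0
    for xi in steps:
        sums.append(sums[-1] + xi)
    scale = math.sqrt(n)

    def x_n(t):
        if not 0.0 <= t <= 1.0:
            raise ValueError("t must lie in [0, 1]")
        k = min(int(n * t), n - 1)    # index of the segment [k/n, (k+1)/n] containing t
        frac = n * t - k              # relative position inside that segment
        return (sums[k] + frac * steps[k]) / scale

    return x_n
```

For instance, with the steps `[1, -1, 1, 1]` the polygon passes through $(1/4, 1/2)$, $(1/2, 0)$ and ends at $(1, 1)$. A continuous functional such as $h(x) = \max_t x(t)$ is attained at a vertex, so it can be read off from the partial sums.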

42.3. Limit distributions; Brownian embedding. For Brownian
convergence there is a direct approach based upon “Brownian embedding”
of random walks into Brownian motion at Brownian times.

a. Brownian embedding lemma. For every r.v. $\xi$ with $E\xi = 0$ and
$0 < \sigma^2\xi < \infty$ there is a Brownian time $\tau_{U,V}$ such that

$$\mathcal{L}(W(\tau_{U,V})) = \mathcal{L}(\xi), \quad E\tau_{U,V} = \sigma^2\xi.$$

Proof. Let Brownian $W = (W(t), t \ge 0)$ be on a pr. space $(\Omega_0, \mathcal{A}_0, P_0)$.
Let $(U, V)$ with $U \le 0 \le V$ be a random vector on a pr. space $(\Omega_1, \mathcal{A}_1,
P_1)$ with d.f. $G$ defined by

$$dG(u, v) = \frac{1}{a}(v - u)\, dF_\xi(u)\, dF_\xi(v), \quad u \le 0 \le v,$$

where

$$a = E\xi^+ = E\xi^- > 0$$

since $E\xi = E\xi^+ - E\xi^- = 0$ and $\xi$ is not degenerate.
On the product pr. space

$$(\Omega, \mathcal{A}, P) = (\Omega_0 \times \Omega_1, \mathcal{A}_0 \times \mathcal{A}_1, P_0 \times P_1)$$

$W$ and $(U, V)$ are independent, that is, $\mathcal{B}(W(t), t \ge 0)$ and $\mathcal{B}(U, V)$ are
independent $\sigma$-fields, and the first exit time

$$\tau_{U,V} = \inf\{t: W(t) \notin (U, V)\}$$

is Brownian since

$$[\tau_{U,V} \le t] = [\min_{s \le t} W(s) \le U] \cup [\max_{s \le t} W(s) \ge V].$$

If $a > 0$ then

$$P(W(\tau_{U,V}) > a) = E\,P(W(\tau_{U,V}) > a \mid U, V)$$

with, by 41.4A,

$$P(W(\tau_{U,V}) > a \mid U = u, V = v) = 0 \text{ or } \frac{|u|}{v - u}$$

according as $v \le a$ or $v > a$. It follows that

$$P(W(\tau_{U,V}) > a) = \int_{u \le 0} \int_{v > a} \frac{|u|}{v - u}\, dG(u, v) = P(\xi > a);$$

similarly, if $a < 0$ then

$$P(W(\tau_{U,V}) < a) = P(\xi < a).$$

Thus

$$\mathcal{L}(W(\tau_{U,V})) = \mathcal{L}(\xi),$$

by the same theorem,

$$E\tau_{U,V} = E(E(\tau_{U,V} \mid U, V)) = E|UV| = E(W(\tau_{U,V}))^2 = \sigma^2\xi$$

and the lemma is proved.
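
For a r.v. $\xi$ with finitely many values the lemma reduces to finite sums, which can be verified exactly. The sketch below is an illustration under two stated assumptions: $\xi$ has no atom at 0 (so that the weights $dG(u, v)$ over $u < 0 < v$ sum to 1), and the exit law of Brownian motion from $(u, v)$ is the standard two-point one, with pr. $|u|/(v - u)$ of leaving at $v$ and mean exit time $|uv|$. The function name is hypothetical.

```python
from fractions import Fraction


def embedding_check(law):
    """law: dict value -> pr. of a centered r.v. xi with no atom at 0.

    Returns (law of W(tau_{U,V}), E tau_{U,V}), obtained by mixing the
    two-point exit distributions over the d.f. G of the lemma.
    """
    neg = {u: p for u, p in law.items() if u < 0}
    pos = {v: p for v, p in law.items() if v > 0}
    a = sum(v * p for v, p in pos.items())        # a = E xi^+ (= E xi^-)
    exit_law = {x: Fraction(0) for x in law}
    mean_tau = Fraction(0)
    for u, pu in neg.items():
        for v, pv in pos.items():
            g = (v - u) * pu * pv / a             # mass dG(u, v)
            exit_law[v] += g * (-u) / (v - u)     # pr. of leaving (u, v) at v
            exit_law[u] += g * v / (v - u)        # pr. of leaving (u, v) at u
            mean_tau += g * (-u * v)              # E(tau | U = u, V = v) = |uv|
    return exit_law, mean_tau
```

Applied to any centered finite law given with `Fraction` values and probabilities, the returned exit law coincides with the input law and the returned mean equals its variance, as the lemma asserts.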

From now on, let $\xi(0, 1)$ denote second-order centered random walk of
iid r.v.'s $\xi_1, \xi_2, \ldots$ with common expectation 0 and variance 1 and denote by
$S_0 = 0, \ldots, S_n = \xi_1 + \cdots + \xi_n, \ldots$ their successive sums.

A. Brownian embedding theorem. Given $\xi(0, 1)$, there is a Brownian
r.f. $W = (W(t), t \ge 0)$ and an independent of $W$ sequence of iid
Brownian times $\tau_1, \tau_2, \ldots$ with common expectation 1 such that

(i) $W(\tau_1)$, $W(\tau_1 + \tau_2) - W(\tau_1)$, $W(\tau_1 + \tau_2 + \tau_3) - W(\tau_1 + \tau_2), \ldots$
are iid r.v.'s with common law $\mathcal{L}(\xi_1)$, equivalently

(ii) $\mathcal{L}(W(\tau_1), W(\tau_1 + \tau_2), W(\tau_1 + \tau_2 + \tau_3), \ldots) = \mathcal{L}(S_1, S_2, S_3, \ldots)$.

Proof. The equivalence assertion is immediate, and we prove (i).

Let Brownian $W$ be on a pr. space $(\Omega_0, \mathcal{A}_0, P_0)$ and let $(U_1, V_1), (U_2, V_2),
\ldots$ be random vectors on pr. spaces $(\Omega_1, \mathcal{A}_1, P_1), (\Omega_2, \mathcal{A}_2, P_2), \ldots$ with
common d.f. $G$ introduced in the above lemma. On the product space

$$(\Omega, \mathcal{A}, P) = (\Omega_0 \times \Omega_1 \times \Omega_2 \times \cdots, \mathcal{A}_0 \times \mathcal{A}_1 \times \mathcal{A}_2 \times \cdots, P_0 \times P_1 \times P_2 \times \cdots),$$

$W$ and the iid random vectors $(U_1, V_1), (U_2, V_2), \ldots$ are independent.

According to the lemma, there is a Brownian first exit time $\tau_1 = \tau_{U_1, V_1}$ of
$W$ from $(U_1, V_1)$ such that

$$\mathcal{L}(W(\tau_1)) = \mathcal{L}(\xi_1), \quad E\tau_1 = E\xi_1^2 = 1,$$

$W^{(1)} = (W(\tau_1 + t) - W(\tau_1), t \ge 0)$ is independent of $(W_s, s \le \tau_1, U_1,
V_1)$ and, by Brownian times origin invariance 41.4B, $W^{(1)}$ is a Brownian
r.f. independent of the events up to time $\tau_1$. Proceeding similarly with first exit time $\tau_2$
of $W^{(1)}$ from $(U_2, V_2)$ and so on, we obtain iid r.v.'s

$$W(\tau_1), \quad W(\tau_1 + \tau_2) - W(\tau_1), \quad W(\tau_1 + \tau_2 + \tau_3) - W(\tau_1 + \tau_2), \ldots$$

with common law $\mathcal{L}(W(\tau_1)) = \mathcal{L}(\xi_1)$, and the proof is concluded.

b. Breiman lemma. Let $\tau_1, \tau_2, \ldots$ be the Brownian exit times in
A. Then

$$T_n = \sup_{0 \le t \le 1} \Big| \frac{\tau_1 + \cdots + \tau_{[nt]}}{n} - t \Big| \xrightarrow{P} 0.$$

Proof. Let $\sigma_k = (\tau_1 - 1) + \cdots + (\tau_k - 1)$ where the summands
have mean 0 since the means of the $\tau$'s are 1 and set $d_k = \sigma_k/k$. Then

Tn = o S ^li 1 a ^ /n 1 + \ 

so that 


Tn - o $ Mi ^ 




and 

7；" =， I 气鄹 I 〜 ,IKI. 

But the Kolmogorov strong law of large numbers applies to successive
sums $\sigma_n$ of iid r.v.'s $\tau_1 - 1, \tau_2 - 1, \ldots$ with mean 0, that is, $\sigma_k/k \to 0$ a.s.
as $k \to \infty$. Thus, for any $a > 0$, upon letting in (1) $n \to \infty$ then $\varepsilon \to 0$,


limsup P(T n > a) ^ P(ea > a) -^0 
and T n -^> 0 obtains. 

By using Brownian embedding one could prove directly the
invariance principle on $\mathcal{C}$ (42.2D). The same procedure yields also a
modified invariance principle which sometimes is more convenient: The one
on $\mathcal{C}$ was established by first interpolating, linearly, successive sums so as
to deal with sample continuous r.f.'s $X_n$. But this approach is somewhat
arbitrary and may be avoided by working with the more direct r.f.'s $Y_n$
defined by $Y_n(t) = S_{[nt]}/\sqrt{n}$ for $0 \le t \le 1$. However, their sample
functions are not continuous and, thus, the space $\mathcal{C}$ cannot be used. In
fact, these sample functions belong to the linear space $D$ (containing $\mathcal{C}$)
of functions $x = (x(t), 0 \le t \le 1)$ which are right continuous and have
left limits:

$$x(t) = x(t+) = \lim_{s \downarrow t} x(s) \text{ for } 0 \le t < 1,$$

$$x(t-) = \lim_{s \uparrow t} x(s) \text{ exists for } 0 < t \le 1,$$

and we set $x(1) = x(1-)$; note that these functions are bounded.

The space $D$ can be given (Skorohod) a metrizable topology which
makes it complete and separable. Then the Prohorov approach applies:
An analogue of the Arzelà-Ascoli theorem is obtained for the space $D$ and
yields an analogue of 42.2B for r.f.'s with sample functions in $D$, which
then yields Brownian convergence with an invariance principle in $D$.
However, we are concerned only with conditions for $\mathcal{L}(Y_n) \to \mathcal{L}(W)$ and
those can be established directly, using Brownian embedding, as follows.

Let $x, y$ belong to $D$ and introduce on it the uniform norm $\|x\| =
\sup_{0 \le t \le 1} |x(t)|$ ($< \infty$ since every $x$ is bounded) with corresponding metric
$\|x - y\|$ so that its subspace $\mathcal{C}$ has the same metric as previously.
(Note that with this metric $D$ is not separable!)

Extend $P_W$, which lives on $\mathcal{C}$, to $D$ by assigning to every Borel set in $D$
the $P_W$-measure of its trace on $\mathcal{C}$ (which trace is a Borel set in $\mathcal{C}$ and in
$D$). We are now ready to treat the problem:

Given $\xi(0, 1)$, let $Y_n = (Y_n(t), 0 \le t \le 1)$ be defined by $Y_n(t) =
S_{[nt]}/\sqrt{n}$. Let $W_n = (W_n(t), t \ge 0)$ be defined by $W_n(t) = \sqrt{n}\,
W(t/n)$; by Brownian scale change invariance 41.1D the r.f.'s $W_n$ are
Brownian. Replace $W$ by $W_n$ in A and denote by $\tau_{n1}, \tau_{n2}, \ldots$ the
corresponding iid r.v.'s.

Thus,

$$\mathcal{L}(W_n(\tau_{n1}), W_n(\tau_{n1} + \tau_{n2}), \ldots) = \mathcal{L}(S_1, S_2, \ldots)$$

so that, for the r.f.'s $\tilde{Y}_n$ with $\tilde{Y}_n(t) = W((\tau_{n1} + \cdots + \tau_{n[nt]})/n)$,
we have $\mathcal{L}(Y_n) = \mathcal{L}(\tilde{Y}_n)$. It follows that to prove the theorem stated
below, we can replace $Y_n$ by $\tilde{Y}_n$ and show that $\mathcal{L}(\tilde{Y}_n) \to \mathcal{L}(W)$ and
$\mathcal{L}(h(\tilde{Y}_n)) \to \mathcal{L}(h(W))$ where $h$ are $P_W$-admissible on $D$, that is, $P_W$-a.e.
continuous Borel functions on $D$ to metric spaces with their $\sigma$-fields of
Borel sets.

Since $W_n$ and $W$ are both Brownian, the iid r.v.'s $\tau_{n1}, \tau_{n2}, \ldots$ have
the same common distribution as the iid r.v.'s $\tau_1, \tau_2, \ldots$, so that, by b
with $W_n$ in lieu of $W$,

$$T_{nn} = \sup_{0 \le t \le 1} \Big| \frac{\tau_{n1} + \cdots + \tau_{n[nt]}}{n} - t \Big| \xrightarrow{P} 0.$$

o 純 n 

B. Invariance principle on $D$. Given $\xi(0, 1)$, let $Y_n$ on $[0, 1]$ be r.f.'s
with $Y_n(t) = S_{[nt]}/\sqrt{n}$. Then $\mathcal{L}(Y_n) \to \mathcal{L}(W)$, in fact $\mathcal{L}(h(Y_n)) \to
\mathcal{L}(h(W))$ for every $P_W$-admissible $h$ on $D$.

Proof. According to what precedes, it suffices to prove this principle
with the $Y_n$ replaced by the equivalent r.f.'s $\tilde{Y}_n$ with

$$\tilde{Y}_n(t) = W((\tau_{n1} + \cdots + \tau_{n[nt]})/n).$$

By 12.1A and its Corollary 1, we have only to show that, given a bounded
continuous function $g$ on $D$, $\mathcal{L}(g(\tilde{Y}_n)) \to \mathcal{L}(g(W))$, equivalently that
every subsequence of integers $n$ contains a sequence $(n')$ such that
$\mathcal{L}(g(\tilde{Y}_{n'})) \to \mathcal{L}(g(W))$.
But $T_{nn} \xrightarrow{P} 0$ implies that every subsequence of integers $n$ contains a
sequence $(n')$ such that $T_{n'n'} \to 0$ a.s. Since Brownian sample functions on
$[0, 1]$ are uniformly continuous, it follows that

$$\sup_{0 \le t \le 1} |W((\tau_{n'1} + \cdots + \tau_{n'[n't]})/n') - W(t)| \to 0 \text{ a.s.},$$

that is, almost all sample functions of $\tilde{Y}_{n'}$ converge to those of $W$
uniformly on $[0, 1]$. Therefore $g(\tilde{Y}_{n'}) \to g(W)$ a.s., a fortiori

$$\mathcal{L}(g(\tilde{Y}_{n'})) \to \mathcal{L}(g(W)),$$

and the proof is concluded.

In a completely different direction, from convergence of laws to
almost sure convergence, Brownian embedding yields the theorem below
which is essentially the start of Strassen's LIT invariance investigation.

C. LIT comparison theorem. Given $\xi(0, 1)$, as $t \to \infty$,

$$(S_{[t]} - W(t))/\sqrt{2t \log\log t} \to 0 \text{ a.s.}$$
Proof. When $t \to \infty$, $[t]/t \to 1$ and, by the strong law of large
numbers, $(\tau_1 + \cdots + \tau_{[t]})/t \to 1$ a.s. Thus, given $q > 1$, there is an a.s. finite r.v. $\tau_q$ such
that, for $t > \tau_q$,


= ^[i] = iq 

hence 

I W m ) -轉丨刍卻） 

where 

2 (/) = sup{| X(s) - X(t) \u/q ^ s ^ tq). 

For q n ^ t ^ q n ^\ by triangle inequality, 

z (0 ^ sup{| X(s) - X{t) I ： q^ 1 ^ s ^ q^) ^ 2Z n 

where 

2 n = SUp{| X(S) — 疒一 0 I : 疒 — 1 S S ^ 9 n + 2 }， 

while g(() ^ g{q n ) so that 

Z{t)/g(t) ^ 2Z n /g{q^) 

and, q > 1 being arbitrary, the assertion will follow by proving that，as 
9—1 ， 

limsup Z n /g{q n ) —► 0. 

n 

Let q n + 2 — q n ^~ l = 2e 2 q n so that, as 9 — 1 ， 

€ = 一 1/^)2 — > 0 . 

We are now on usual grounds. By normal approximation and extrema 
bounds in 41.2 ， 

P(Zn/g(q n ) > e) 


^ 2P(| /F ( 疒 + 2 ) — fv{ q n-i) I > 2v /^T+ 2= fl)l 0 gl 0 g 疒 ) 


Thus, by the Borel-Cantelli lemma, 


P(Z n /g(q n ) > e i,o.) = 0 


that is, a.s. 


limsup Z n /g(q n ) ^ € 

n 

and, letting q — 1 ， the proof is concluded. 


S v 2 /^/ n 2 (\ogq) 2 . 
The (Hartman-Wintner) LIT for $\xi(0, 1)$ is that a.s.

$$\limsup S_n/g(n) = 1, \quad \liminf S_n/g(n) = -1, \quad \limsup |S_n|/g(n) = 1.$$

The LIT comparison theorem C yields at once

Corollary. The LIT for $\xi(0, 1)$ and for Brownian motion entail each
other and follow from LIT for any specific $\xi(0, 1)$, say, the coin-tossing
random walk.

42.4. Some specific functionals. So far we emphasized, as usual, ideas
and methods. We shall now give some applications to limit distributions
of specific functionals.

Recall that $\xi(0, 1)$ is a random walk $(\xi_1, \xi_2, \ldots)$ with common
expectation 0 and variance 1, and successive sums $S_0 = 0, \ldots, S_n = \xi_1 +
\cdots + \xi_n, \ldots$.

The r.f.'s $X_n = (X_n(t), 0 \le t \le 1)$ are defined by

$$X_n(t) = (S_{[nt]} + (nt - [nt])\, \xi_{[nt]+1})/\sqrt{n}$$

and their sample functions belong to $\mathcal{C}$.

The r.f.'s $Y_n = (Y_n(t), 0 \le t \le 1)$ are defined by

$$Y_n(t) = S_{[nt]}/\sqrt{n}$$

and their sample functions belong to $D$. We use invariance principles on
$\mathcal{C}$ and on $D$:

$\mathcal{L}(h(X_n)) \to \mathcal{L}(h(W))$ for every $P_W$-a.e. continuous Borel function $h$ on $\mathcal{C}$
to a metric space.

$\mathcal{L}(h(Y_n)) \to \mathcal{L}(h(W))$ for every $P_W$-a.e. continuous Borel function $h$ on $D$ to a
metric space.

Two approaches are available: Given a functional $h$ on $\mathcal{C}$ or on $D$,
either $\mathcal{L}(h(W))$ is known or relatively easy to find so that the limit
distribution of $\mathcal{L}(h(X_n))$ or $\mathcal{L}(h(Y_n))$, expressed in terms of $\mathcal{L}(h(W))$, obtains,
or there is a suitable random walk for which $\mathcal{L}(h(X_n))$ or $\mathcal{L}(h(Y_n))$ is known or
relatively easy to find so that, by passage to the limit, $\mathcal{L}(h(W))$ obtains.
Among such random walks is the coin-tossing one: the common law
assigning pr. 1/2 to values $+1$ and $-1$.

The following examples will serve to illustrate these approaches.

Arcsine law. We use invariance principle on $D$. Let $h(x) = \lambda\{t: 0 \le
t \le 1$ and $x(t) > 0\}$, $x \in D$ ($\lambda$ denoting Lebesgue measure). It is a Borel function on $D$ to $R$ and,
by 41.4C, it is $P_W$-a.e. continuous. If $\nu_n$ is the number of those of the $n$ first
successive sums which are positive, then $h(Y_n) = \nu_n/n$. Andersen
Arcsine law applies to the coin-tossing walk. Thus, $\mathcal{L}(\nu_n/n)$ converges to the
Arcsine law defined by its pr. density

$$p(a) = \frac{1}{\pi\sqrt{a(1 - a)}}, \quad 0 < a < 1,$$

and its d.f. is given by

$$F(a) = \frac{2}{\pi} \operatorname{Arcsin} \sqrt{a}.$$

Since this limit law is $\mathcal{L}(h(W))$, it is the Arcsine law due to P. Lévy, that
of the total amount of time above 0 spent in $[0, 1]$ by Brownian motion.
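
The density and the d.f. of the Arcsine law stated above can be checked against each other numerically. The sketch below is only a consistency check of the two displayed formulas (the function names and the midpoint rule are illustrative choices): integrating $p$ over an interior interval should reproduce the increment of $F$.

```python
import math


def arcsine_density(a):
    """Density p(a) = 1 / (pi * sqrt(a (1 - a))) on (0, 1)."""
    return 1.0 / (math.pi * math.sqrt(a * (1.0 - a)))


def arcsine_cdf(a):
    """D.f. F(a) = (2 / pi) * arcsin(sqrt(a)) on [0, 1]."""
    return 2.0 / math.pi * math.asin(math.sqrt(a))


def midpoint_integral(f, lo, hi, cells=20000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    width = (hi - lo) / cells
    return width * sum(f(lo + (i + 0.5) * width) for i in range(cells))
```

The interval is kept away from 0 and 1 because the density is unbounded there; the d.f. itself is symmetric in the sense that $F(a) + F(1 - a) = 1$, so that $F(1/2) = 1/2$.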

From now on, we use the invariance principle on $\mathcal{C}$ without further
comment.

Let $h(x) = x(1)$ so that $h(X_n) = X_n(1) = S_n/\sqrt{n}$ and the normal convergence
theorem obtains since $\mathcal{L}(S_n/\sqrt{n}) \to \mathcal{L}(W(1)) = \mathcal{N}(0, 1)$. This very
particular case of the invariance principle shows how far-reaching it is,
since this principle results from the above normal convergence.

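Extrema laws. Let

$$m_n = \min_{0 \le k \le n} S_k, \quad M_n = \max_{0 \le k \le n} S_k$$

and recall that

$$m(t) = \min_{s \le t} W(s), \quad M(t) = \max_{s \le t} W(s).$$

The functions $h_1$ and $h_2$ on $\mathcal{C}$ to $R$ and $h_3$ on $\mathcal{C}$ to $R^2$, defined by

$$h_1(x) = \min_{0 \le t \le 1} x(t), \quad h_2(x) = \max_{0 \le t \le 1} x(t), \quad h_3(x) = (\max_{0 \le t \le 1} x(t), x(1))$$

are $P_W$-continuous hence

$$\mathcal{L}(m_n/\sqrt{n}) \to \mathcal{L}(m(1)), \quad \mathcal{L}(M_n/\sqrt{n}) \to \mathcal{L}(M(1)), \quad \mathcal{L}(M_n/\sqrt{n}, S_n/\sqrt{n}) \to \mathcal{L}(M(1), W(1)),$$

where the limit laws obtain by 41.4c and E.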

These limit distributions are but particular cases of the limit
distribution described by (I) and (II) below, obtained by using the coin-tossing
walk as indicated in what follows.

The function $h$ on $\mathcal{C}$ to $R^3$, defined by

$$h(x) = (\min_{0 \le t \le 1} x(t), \max_{0 \le t \le 1} x(t), x(1))$$

is continuous so that

(I) $\mathcal{L}(m_n/\sqrt{n}, M_n/\sqrt{n}, S_n/\sqrt{n}) \to \mathcal{L}(m(1), M(1), W(1))$.

Let

$$p_n(k) = P(S_n = k), \quad p_n(a, b, c) = P(a < m_n \le M_n < b, S_n = c).$$

The relation to be established in the coin-tossing case is

(1) $p_n(a, b, c) = \sum_{j=-\infty}^{+\infty} p_n(c + 2j(b - a)) - \sum_{j=-\infty}^{+\infty} p_n(2b - c + 2j(b - a))$

with integers $a, b, c$ such that

$$a \le 0 \le b, \quad a < b, \quad a \le c \le b.$$

Proceed by induction: For $n = 0$ this relation obtains by direct
consideration of cases. Suppose it holds for $n - 1$. If $a = 0$ then both sides
in (1) vanish, the right side because $p_n(k) = p_n(-k)$; similarly if $b = 0$.
Thus (1) holds for $n$ if $a = 0$ or $b = 0$. Otherwise $a + 1 \le 0$ and $b -
1 \ge 0$, so that the relation holds with the arguments replaced by $n - 1$, $a - 1$,
$b - 1$, $c - 1$ and by $n - 1$, $a + 1$, $b + 1$, $c + 1$. Then (1) obtains by
using the immediate recurrence relations

$$p_n(k) = \tfrac{1}{2} p_{n-1}(k - 1) + \tfrac{1}{2} p_{n-1}(k + 1)$$

and

$$p_n(a, b, c) = \tfrac{1}{2} p_{n-1}(a - 1, b - 1, c - 1) + \tfrac{1}{2} p_{n-1}(a + 1, b + 1, c + 1).$$

By summing over $c$, given $a \le 0 \le b$ and $a \le c_1 < c_2 \le b$, (1) yields

(2) $P(a < m_n \le M_n < b, c_1 < S_n < c_2)$

$$= \sum_{j=-\infty}^{+\infty} P(c_1 + 2j(b - a) < S_n < c_2 + 2j(b - a)) - \sum_{j=-\infty}^{+\infty} P(2b - c_2 + 2j(b - a) < S_n < 2b - c_1 + 2j(b - a)).$$

Replace $a, b, c_1, c_2$ by $[a\sqrt{n}]$, $-[-b\sqrt{n}]$, $[c_1\sqrt{n}]$, $-[-c_2\sqrt{n}]$, respectively.
It is possible to pass to the limit termwise in (2) so that

(II) $P(a < m(1) \le M(1) < b, c_1 < W(1) < c_2)$

$$= \sum_{j=-\infty}^{+\infty} P(c_1 + 2j(b - a) < W(1) < c_2 + 2j(b - a)) - \sum_{j=-\infty}^{+\infty} P(2b - c_2 + 2j(b - a) < W(1) < 2b - c_1 + 2j(b - a)),$$

where $\mathcal{L}(W(1)) = \mathcal{N}(0, 1)$.
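
Relation (1) for the coin-tossing walk can be confirmed exactly for small $n$ by comparing a direct count of confined paths with the alternating series of images. The sketch below is an illustration with hypothetical function names; it takes $m_n$ and $M_n$ to include $S_0 = 0$, which is an assumption consistent with the case analysis $a = 0$ or $b = 0$ in the induction.

```python
from fractions import Fraction
from math import comb


def p_n(n, k):
    """p_n(k) = P(S_n = k) for the coin-tossing walk."""
    if abs(k) > n or (n + k) % 2:
        return Fraction(0)
    return Fraction(comb(n, (n + k) // 2), 2 ** n)


def confined(n, a, b, c):
    """p_n(a, b, c) = P(a < m_n <= M_n < b, S_n = c), by direct recursion."""
    if not a < 0 < b:
        return Fraction(0)           # S_0 = 0 already violates a < m_n or M_n < b
    mass = {0: Fraction(1)}          # mass[s]: walk at s, never outside (a, b)
    for _ in range(n):
        nxt = {}
        for s, p in mass.items():
            for step in (-1, 1):
                if a < s + step < b:
                    nxt[s + step] = nxt.get(s + step, Fraction(0)) + p / 2
        mass = nxt
    return mass.get(c, Fraction(0))


def image_series(n, a, b, c):
    """Right side of relation (1); only finitely many terms are nonzero."""
    span = b - a
    js = range(-n - 2, n + 3)        # |j| beyond this range gives arguments > n
    plus = sum((p_n(n, c + 2 * j * span) for j in js), Fraction(0))
    minus = sum((p_n(n, 2 * b - c + 2 * j * span) for j in js), Fraction(0))
    return plus - minus
```

For example, with $n = 3$, $a = -1$, $b = 2$, $c = 1$ only the path $(+1, -1, +1)$ stays inside $(a, b)$, so both sides equal $1/8$.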


COMPLEMENTS AND DETAILS 

All r.f.'s are separable. Unless otherwise stated, $W_T = (W_t, t \ge 0)$ or $(W(t),
t \ge 0)$, $T = [0, \infty)$, is Brownian motion with $\sigma$-fields $\mathcal{B}_T = (\mathcal{B}_t, t \ge 0)$, and
notation is that of the chapter.

1. Some transformations. a) Let $Y(t) = e^{-t} W(e^{2t})$, $t > 0$,

$Z(t) = W(t) - tW(1)$, $0 \le t \le 1$.

Then

$EY(s)Y(t) = e^{-|s - t|}$, $EZ(s)Z(t) = (1 - s)t$ for $t \le s$.

b) Let U{t) = fV(s) ds y V{t) = expjrJ' si s )^{ s ) c £2 R y g continuous 
on (0 ， oo). Then 

EU{t) = 0, EU\t) = / 3 /3, 

E expjrj^ fV{s) = ^ fi2<3/6 , E expjr J' sfV(s) ds^ = e c2tblu 
and, in the general case, 

£〆(/)= exp|r 2 j^ 咖 )(1 ug{u) du 、 

c) $|W_T|$ is the reflected at the origin Brownian $W_T$, with

$E|W(t)| = \sqrt{2t/\pi}$, $\sigma^2(|W(t)|) = (1 - 2/\pi)t$.

d) (e w ^\ / ^ 0) is called geometric Brownian. Then 

Ee w ^ = e t! \ £>_) = 〆< ， ^( e wu)) = e ti^ e t _ 

Find its so-called “diffusion coefficients” (for = 少）： 

lim £(，(<+*) — e w ^\e w ^)/h y lim 五 {(〆(<+« — e w ^Y\e w ^\/h. 
hio ^ i o 

2. Brownian second-order properties. $W_T$ is a r.f. of second order. Use
systematically every result applicable to $W_T$ in Chapter XI and its
Complements and Details, in 38.3C and in Complements and Details 5 to 8 of Chapter
XII. Examples:

a) Let $L_2 = L_2(\Omega, \mathcal{B}, P)$ be the space of equivalence classes of square
integrable r.v.'s on $\Omega$. The $L_2$-norm is $\|W(s) - W(t)\| = \sqrt{|s - t|}$, $W_T$ is a
continuous curve in $L_2$, and the subspace $L_2(W_T)$ spanned by the $W_t$ is separable.

b) Let $s < t < u$. $E(W(t) \mid W(s), W(u))$ is the projection of $W(t)$ on the
plane in $L_2(W_T)$ spanned by $W(s)$ and $W(u)$. In fact, a.s.

$$W(t) = E(W(t) \mid W(s), W(u)) + \sigma(t)\xi(t)$$

where $\xi(t)$ and $(W(s), W(u))$ are independent, $\mathcal{L}(\xi(t)) = \mathcal{N}(0, 1)$,

$$\sigma(t) = ((t - s)(u - t)/(u - s))^{1/2}$$

and $E(W(t) \mid W(s), W(u))$ is obtained by the linear interpolation of $W(s)$ and
$W(u)$ given by

$$E(W(t) \mid W(s), W(u)) = \frac{u - t}{u - s} W(s) + \frac{t - s}{u - s} W(u) \text{ a.s.},$$

$\xi(t)$ is independent of $(W(r), r \notin (s, u))$ and for disjoint intervals with
$s < u < s' < u'$, $(\xi(t), s < t < u)$ and $(\xi(t'), s' < t' < u')$ are independent.
Furthermore, for $s \le t \le t' \le u$, the square of the covariance

$$E\xi(t)\xi(t') = \left(\frac{(t - s)(u - t')}{(t' - s)(u - t)}\right)^{1/2}$$

is the anharmonic ratio of $s, u, t, t'$. It is invariant under projective
transformations of $[0, \infty)$ and so is $\mathcal{L}(\xi(t), t \ge 0)$, provided $[0, \infty)$ is transformed into itself.

In particular, $\mathcal{L}(W(t)/\sqrt{t}, 0 < t < \infty) = \mathcal{L}(\sqrt{t}\, W(1/t), 0 < t < \infty)$.
c) Construct $W_{[0, \pi]}$: Use 37.5B and verify that

$$s \wedge t = \frac{st}{\pi} + \frac{2}{\pi} \sum_{n=1}^{\infty} \frac{\sin ns \sin nt}{n^2}$$

where the series converges uniformly and absolutely on $[0, \pi] \times [0, \pi]$. Then

$$\frac{t}{\sqrt{\pi}}\, \xi_0 + \sqrt{\frac{2}{\pi}} \sum_{n=1}^{\infty} \xi_n \frac{\sin nt}{n},$$

where $\xi_0, \xi_1, \ldots$ are iid with common law $\mathcal{N}(0, 1)$, is $W(t)$ on $[0, \pi]$ provided
sample continuity is proved: take partial sums, show that they converge
uniformly on $[0, \pi]$ outside a null event, and then eliminate it, as usual.
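
The series expansion of $s \wedge t$ on $[0, \pi] \times [0, \pi]$ asked for in c) can be tried numerically before it is proved. The sketch below, with a hypothetical function name and an arbitrary truncation level, evaluates a partial sum; since the neglected tail is at most $(2/\pi) \sum_{n > N} n^{-2} < 2/(\pi N)$, a few thousand terms already give three-decimal agreement.

```python
import math


def min_series(s, t, terms=5000):
    """Partial sum of st/pi + (2/pi) * sum_{n>=1} sin(ns) sin(nt) / n^2."""
    total = sum(math.sin(n * s) * math.sin(n * t) / (n * n)
                for n in range(1, terms + 1))
    return s * t / math.pi + 2.0 / math.pi * total
```

For instance, `min_series(0.3, 1.2)` is close to `0.3` and `min_series(2.5, 0.7)` is close to `0.7`, matching $s \wedge t$.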

d) Examine the r.f.'s in 1 in $L_2$-terms.

3. Martingales. $(W_t, \mathcal{B}_t, t \ge 0)$ and $(W^2(t) - t, \mathcal{B}_t, t \ge 0)$ are martingales.

Also

$(Y_c(t), \mathcal{B}_t, t \ge 0)$ where $Y_c(t) = \exp\{cW(t) - \tfrac{1}{2}c^2 t\}$

is a martingale for every $c > 0$ and so is $(Y'_c(t), \mathcal{B}_t, t \ge 0)$ where $Y'_c(t) =
Y_c(t) + Y_{-c}(t)$.

a) The Kolmogorov inequality for Brownian motion is

$$P(\sup_{s \le t} |W(s)| \ge \varepsilon) \le t/\varepsilon^2$$

and yields

$$W(t)/t \to 0 \text{ a.s. as } t \to +\infty:$$

Take $t = 2^n$, $\varepsilon = 2^{2n/3}$ and use Borel-Cantelli.

b) Let $\tau = \tau_{a,b}$. Then

$$E(W^2(\tau \wedge n) - \tau \wedge n) = E(W^2(0) - 0) = 0$$

hence

$$E(\tau \wedge n) = E(W^2(\tau \wedge n)) \le (|a| + b)^2$$

and

$$E\tau = \lim_n \int_0^n P(\tau > t)\, dt = \lim_n E(\tau \wedge n) \le (|a| + b)^2 < \infty.$$

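c) Let $W_T^\alpha$ be a Brownian r.f. with drift $\alpha \ne 0$. The martingale with $Y_c(t)$ becomes a martingale with

$$Y_c^\alpha(t) = \exp\left\{cW^\alpha(t) - \left(c\alpha + \frac{c^2}{2}\right) t\right\}.$$

Take $c_0 = -2\alpha$ so that $Y_{c_0}^\alpha(t) = \exp\{c_0 W^\alpha(t)\}$. It follows, using the corollary of 39.2A, that

$$1 = E(Y_{c_0}^\alpha(\tau)) = \exp\{c_0 a\} P(W^\alpha(\tau) = a) + \exp\{c_0 b\} P(W^\alpha(\tau) = b).$$

Compute $P(W^\alpha(\tau) = a)$ and $P(W^\alpha(\tau) = b)$.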
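
One way to carry out the computation asked for in c) is to combine the martingale identity with the fact that the two exit probabilities add up to 1 (the exit time being a.s. finite). The sketch below does exactly that; the function name `exit_probabilities` and its argument names are illustrative and not taken from the text.

```python
import math


def exit_probabilities(alpha: float, a: float, b: float) -> tuple[float, float]:
    """Return (p_a, p_b), the probabilities that Brownian motion with drift
    alpha, started at 0, leaves (a, b) at a or at b.

    Solves  1 = exp(c0*a) * p_a + exp(c0*b) * p_b  with  p_a + p_b = 1
    and c0 = -2*alpha, as in the martingale identity.
    """
    if not (a < 0.0 < b):
        raise ValueError("need a < 0 < b")
    if alpha == 0.0:
        raise ValueError("the identity with c0 = -2*alpha needs alpha != 0")
    c0 = -2.0 * alpha
    ea, eb = math.exp(c0 * a), math.exp(c0 * b)
    p_b = (1.0 - ea) / (eb - ea)
    return 1.0 - p_b, p_b


if __name__ == "__main__":
    print(exit_probabilities(0.5, -1.0, 1.0))   # positive drift favors b
    print(exit_probabilities(-0.5, -1.0, 1.0))  # negative drift favors a
```

As the drift tends to 0 the values approach $b/(b - a)$ and $-a/(b - a)$, the driftless exit probabilities.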

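Let $\alpha < 0$ and $M^\alpha = \max_{0 \le t < \infty} W^\alpha(t)$. Then

$$P(M^\alpha \ge c) = e^{-2|\alpha| c}, \quad c \ge 0.$$

Let $\tau = \tau_c$ with $c > 0$. Then the pr. density of $\tau_c$ is given by

$$\frac{c}{\sqrt{2\pi t^3}} \exp\{-c^2/2t\}, \quad t > 0.$$

Let $\tau = \tau_{-a,+a}$, $a > 0$. Use the martingale $Y_c'(t)$ with $c = \sqrt{2\lambda}$ to prove that

$$E e^{-\lambda\tau} = 1/\cosh(a\sqrt{2\lambda}).$$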

4. Cogburn and Tucker quadratic variation. The Brownian quadratic variation is generalized for continuous decomposable processes $X_T$, say, $T = [0, 1]$ with i.d. law $(\alpha_T, \Psi_T)$; it is assumed that $\alpha_T$ is of bounded variation. Let the limits be taken along sequences of partitions $0 = t_{n0} < \cdots < t_{nk_n} = 1$ ordered by refinement with $\max_{k \le k_n} (t_{n,k} - t_{n,k-1}) \to 0$. Then

a) $$\int_T (dX_t)^2 = \lim_{n \to \infty} \sum_k X_{nk}^2 = \sigma^2 + \sum J_t^2 \quad \text{a.s.}$$

where $\sigma^2$ is the variance of the normal component and the sum is that of squares of the jumps of $X_T$.

b) If $g$ on $R$ is continuous with $g(0) = 0$ and has a second derivative at 0, then a.s.

$$\int_T g(dX_t) = \lim_{n \to \infty} \sum_k g(X_{nk}) = g'(0) X_T + \frac{\sigma^2}{2} g''(0) + \sum \big(g(J_t) - g'(0) J_t\big).$$

What if $X_T$ is a Brownian motion?

5. $\mathcal{L}(m(t), M(t), W(t))$. Let $a < 0 < b$, $c = b - a$, $t > 0$.

a) Let $S \subset [a, b]$ be a Borel set, and let

$$g(y) = \frac{1}{\sqrt{2\pi t}} \sum_{n=-\infty}^{+\infty} \big(\exp\{-(y - 2nc)^2/2t\} - \exp\{-(y - 2a + 2nc)^2/2t\}\big).$$

Then

$$P(a < m(t) \le M(t) < b,\ W(t) \in S) = \int_S g(y)\, dy.$$

The proof is based upon the reflection principle used repeatedly.

b) Deduce the individual laws of these 3 r.v.'s, then their two by two joint laws, and compare with the results in 41.4.

c) Find the conditional law of $M(t)$ given $M(t) - W(t) = 0$ and show that

$$P(M(t) > a \mid M(t) = W(t)) = e^{-a^2/2t}.$$

Let $W'(t) = W(1) - W(1 - t)$, $0 \le t \le 1$; $W'_{[0,1]}$ is a Brownian motion. Then

P{W{\) ^ W{t) ^ 0, 0 g g 1) = ⑴ s d w\\) 

= M f {\)) = 1 一 厂 a2 ’ 2 . 

6. Passage times. The passage times are times $\tau_c = \min\{t: W(t) = c\}$, $c \in R$. The r.f. $(\tau_c, c \ge 0)$ is a one-sided stable r.f. with "exponent" 1/2 and "rate" $\sqrt{2}$, that is, it is decomposable with

$$P(\tau_{c+h} - \tau_h \le t) = P(\tau_c \le t) = \frac{c}{\sqrt{2\pi}} \int_0^t s^{-3/2} e^{-c^2/2s}\, ds.$$

To prove, show that what follows holds.

a) $P(\tau_c \le t) = P(M(t) \ge c)$ is the above probability: use change of variable in the integral giving $P(M(t) \ge c)$ in the text. Also

$$P(\tau_c < c^2 u) = \frac{1}{\sqrt{2\pi}} \int_0^u e^{-1/2v} v^{-3/2}\, dv, \quad c > 0,\ u > 0,$$

and $\mathcal{L}(\tau_c / c^2) = \mathcal{L}(\tau_1)$ is independent of $c$. This obtains a priori from the fact that the nature of the r.f. $W(t)$, $t \ge 0$, hence that of $M(t)$, $t \ge 0$ and $\tau_c$, $c \ge 0$, does not change when simultaneously $c$ and $t$ are changed into $\lambda c$ and $\lambda^2 t$, respectively.
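
The identity $P(\tau_c \le t) = P(M(t) \ge c)$ in a) can be checked numerically. The sketch below assumes the reflection-principle formula $P(M(t) \ge c) = 2P(W(t) \ge c)$ for the right-hand side and integrates the passage-time density by a midpoint rule; all function names are illustrative.

```python
import math


def passage_density(c: float, s: float) -> float:
    """Density of tau_c at s > 0: c / sqrt(2 pi s^3) * exp(-c^2 / (2 s))."""
    return c / math.sqrt(2.0 * math.pi * s ** 3) * math.exp(-c * c / (2.0 * s))


def passage_cdf_by_quadrature(c: float, t: float, n: int = 20000) -> float:
    """Midpoint-rule approximation of the integral of the density over (0, t]."""
    step = t / n
    return step * sum(passage_density(c, (i + 0.5) * step) for i in range(n))


def max_tail(c: float, t: float) -> float:
    """P(M(t) >= c) = 2 P(W(t) >= c) = erfc(c / sqrt(2 t))."""
    return math.erfc(c / math.sqrt(2.0 * t))


if __name__ == "__main__":
    for c, t in [(1.0, 2.0), (0.5, 1.0), (2.0, 3.0)]:
        print(c, t, passage_cdf_by_quadrature(c, t), max_tail(c, t))
```

The scaling property stated in a) is visible in the same functions: `max_tail(c, c * c * u)` does not depend on `c`.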

b) $(\tau_c, c \ge 0)$ is decomposable with

$$\psi(u, c) = \log E e^{iu\tau_c} = (-1 + i)\, c\sqrt{u} = \frac{c}{\sqrt{2\pi}} \int_0^\infty (e^{iuv} - 1)\, v^{-3/2}\, dv, \quad c > 0,\ u > 0:$$

Show that $\mathcal{L}(\tau_{c+h} - \tau_h) = \mathcal{L}(\tau_c)$, use what precedes, note that $\tau_c$ does not decrease as $c$ increases and that the decomposable r.f. has only positive jumps.

c) $(\tau_c, c > 0)$ is a sum of positive jumps. For every interval of length $c$, the number $\nu(h)$ of jumps whose height exceeds $h$ is a Poisson r.v. with expectation

$$\frac{c}{\sqrt{2\pi}} \int_h^\infty v^{-3/2}\, dv = c\sqrt{2/\pi h}$$

and the total length $\Lambda$ of jumps exceeded by $h$ has for expectation

$$\frac{c}{\sqrt{2\pi}} \int_0^h v^{-1/2}\, dv = c\sqrt{2h/\pi}.$$

Furthermore, setting $c\sqrt{2/\pi} = s$, the Law of Large Numbers yields

$$\nu(h)\sqrt{h} \to s, \quad \Lambda/\sqrt{h} \to s \quad \text{a.s.}$$


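7. Zeros. Let $s \le t \le u$.

a) The conditional pr.

$$P(W(t) \text{ has at least one zero between } s \text{ and } u \mid W(s) = c) = \frac{|c|}{\sqrt{2\pi}} \int_0^{u-s} e^{-c^2/2v} v^{-3/2}\, dv.$$

It follows that

$$P(W(t) \text{ has at least one zero between } s \text{ and } u) = \frac{2}{\pi} \operatorname{Arccos} \sqrt{s/u}.$$

b) Let $\tau$ be the largest zero of $W_T$ not exceeding $t$, and let $\tau'$ be the smallest zero of $W_T$ exceeding $t$. Then

$$P(\tau < s) = \frac{2}{\pi} \operatorname{Arcsin} \sqrt{s/t}, \qquad P(\tau' < u) = \frac{2}{\pi} \operatorname{Arccos} \sqrt{t/u},$$

$$P(\tau < s,\ \tau' > u) = \frac{2}{\pi} \operatorname{Arcsin} \sqrt{s/u}.$$

c) $P(M(t) - W(t) \text{ has at least one zero in } (s, u)) = (2/\pi) \operatorname{Arccos} \sqrt{s/u}$.

If $\bar\tau$ is the largest zero of $M_T - W_T$ not exceeding $t$ and $\bar\tau'$ is the smallest zero of $M_T - W_T$ exceeding $t$, then $\mathcal{L}(\bar\tau) = \mathcal{L}(\tau)$ and $\mathcal{L}(\bar\tau') = \mathcal{L}(\tau')$.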
AN EXTENSION OF DONSKER'S THEOREM

NOTE by L. Le Cam 

Donsker's theorem, Section 42.2, may be extended to cover cases where variances do not exist. Since passage from the symmetric cases to nonsymmetric ones presents absolutely no difficulty, we shall describe only the symmetric situation.

Consider a double array $\xi_{nk}$; $k = 1, 2, \cdots, k_n$, $n = 1, 2, \cdots$ as in Section 42.2 but subject to the following assumptions:

a) For each $n$ the $\xi_{nk}$ are independent

b) $\mathcal{L}(\xi_{nk}) = \mathcal{L}(-\xi_{nk})$

c) If $\sigma_{nk}^2 = E \min(1, \xi_{nk}^2)$, then $\sup_k \sigma_{nk}^2 \to 0$ and

d) $\sum_k \sigma_{nk}^2 = 1$.

Construct a process $X_n(t)$ by linear interpolation exactly as in Section 42.2 but with times $t_{nk}$ computed with the present definition of the $\sigma_{nk}^2$. Let $S_n$ be the last sum $S_n = S_{nk_n}$ and let $W$ be the Wiener process on $[0, 1]$.

THEOREM. Assume that the conditions a, b, c, d above are satisfied. Then the following statements are equivalent.

i) There is a tight sequence $\{G_n\}$ of Gaussian distributions such that the Lévy distance between $\mathcal{L}(S_n)$ and $G_n$ tends to zero.

ii) $\mathcal{L}(X_n) \to \mathcal{L}(W)$.

The proof uses a rougher interpolation as follows. For each integer $n$ construct times $\tau(n, r, j)$ starting with $\tau(n, r, 0) = 0$. The value $\tau(n, r, j)$ will be a certain $t_{nk}$ for $k = k(j)$.

If $\tau(n, r, j)$ has been chosen, equal to some $t_{nk(j)}$, then $\tau(n, r, j + 1)$ is equal to $t_{nk}$ for the first $k > k(j)$ for which $t_{nk} > t_{nk(j)} + (1/r)$.

Define a process $Y_{nr}(t)$, $t \in [0, 1]$, by taking $Y_{nr}(t) = X_n(t)$ if $t$ is one of the times $\tau(n, r, j)$. Interpolate linearly between the values so obtained.
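
The selection of the rough division can be sketched as a single pass over the fine times. In the sketch below the names `coarse_indices` and `rough_value` are illustrative, `times[k]` stands for $t_{nk}$ and `values[k]` for $X_n(t_{nk})$. Holding the last selected value constant after the last selected time is an assumption made here, since the construction above does not say how the right end of $[0, 1]$ is treated.

```python
def coarse_indices(times: list[float], r: int) -> list[int]:
    """Indices k(0) = 0 < k(1) < ... where k(j+1) is the first k > k(j)
    with times[k] > times[k(j)] + 1/r."""
    chosen = [0]
    for k in range(1, len(times)):
        if times[k] > times[chosen[-1]] + 1.0 / r:
            chosen.append(k)
    return chosen


def rough_value(times: list[float], values: list[float],
                chosen: list[int], t: float) -> float:
    """Value at t of the path that agrees with `values` at the chosen times
    and is linear in between."""
    points = [(times[k], values[k]) for k in chosen]
    if t >= points[-1][0]:
        return points[-1][1]  # assumption: constant after the last chosen time
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        if t0 <= t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    raise ValueError("t lies before the first time point")


if __name__ == "__main__":
    times = [k / 10 for k in range(11)]
    values = [float(k * k) for k in range(11)]
    chosen = coarse_indices(times, 4)
    print(chosen, rough_value(times, values, chosen, 0.15))
```

Each block between successive chosen indices has time length at most $1/r$ plus one fine step, which is the bound used in the proof of Lemma 1.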

The essential result is as follows. 

LEMMA 1. Let conditions a, b, c, d be satisfied and assume that statement (i) of the theorem holds. Then, for every $\epsilon > 0$, there is an integer $r$ and an integer $n(\epsilon)$ such that

$$P\{\sup_t |X_n(t) - Y_{nr}(t)| > \epsilon\} < \epsilon.$$

Proof. Let $\tau_1 = \tau(n, r, j)$ and $\tau_2 = \tau(n, r, j + 1)$ be two successive points of the rough division. Note that by symmetry

$$P\{\sup_{\tau_1 < t < \tau_2} |X_n(t) - Y_{nr}(t)| > \epsilon\} \le 2P\{|X_n(\tau_2) - X_n(\tau_1)| > \ldots\}.$$

The difference involved here is a certain block sum $B(n, r, j) = \sum_k \{\xi_{nk};\ k(j) < k \le k(j + 1)\}$. It will be sufficient to show that one can select $r$ so that eventually $\sum_j P\{|B(n, r, j)| > \epsilon\} < \ldots$. If not, there would exist an $\epsilon_0 > 0$ such that for each $r$ there is some $n(r) > r$ for which

$$\sum_j P\{|B(n(r), r, j)| > \epsilon_0\} > \epsilon_0.$$

Now for the block in question the sum of truncated variances $\sum \{\sigma_{nk}^2;\ k(j) < k \le k(j + 1)\}$ is at most $r^{-1} + \sup_k \sigma_{nk}^2$.

Thus the variables $B(n(r), r, j)$ are still uniformly asymptotically negligible. The existence of the $\epsilon_0 > 0$ would then violate the Normal Convergence Criterion (page 328, Vol. I). This completes the proof of the lemma.

The lemma may be restated noting that the interpolation procedure used to pass from $X_n$ to $Y_{nr}$ is a certain linear transformation of $C[0, 1]$ into itself, say $M_r$.

The conclusion of Lemma 1 can then be restated as the assumption (i) in the following lemma.

LEMMA 2. Let $X_n$, $n = 0, 1, \cdots$ and $W$ be random elements with values in $C[0, 1]$. Assume

i) There are linear transformations $M_r$ of $C[0, 1]$ into itself such that for each $\epsilon > 0$ there is an $r > \epsilon^{-1}$ and an $n(\epsilon)$ such that $n > n(\epsilon)$ implies $P\{\|(I - M_r)X_n\| > \epsilon\} < \epsilon$.

ii) For each $r$ the Prohorov distance between $\mathcal{L}(M_r X_n)$ and $\mathcal{L}(M_r W)$ tends to zero.

iii) $\lim_r \mathcal{L}(M_r W) = \mathcal{L}(W)$.

Then $\mathcal{L}(X_n) \to \mathcal{L}(W)$.

This is easily proven by looking at differences of expectations $Eh(M_r X_n) - Eh(X_n)$ where $h$ is a bounded uniformly continuous function defined on $C[0, 1]$. Condition (i) makes these differences small. Condition (ii) makes the differences $Eh(M_r X_n) - Eh(M_r W)$ small and condition (iii) makes the differences $Eh(M_r W) - Eh(W)$ small.

This is sufficient to imply convergence since each bounded continuous function is at the same time the pointwise supremum of a countable set of uniformly continuous functions and the pointwise infimum of another such set.

Here we have just proved that (i) of the Lemma holds. That (ii) holds follows 
from the usual finite dimensional central limit theorem. Part (iii) may be 
proved by noting that the argument of Lemma 1 applies just as well to the 
Wiener process itself. 

With appropriate modifications the Theorem admits extensions to the case where the variables $\xi_{nk}$ take values in a separable Banach space or its weak second dual. Statements of this nature can be found in "Remarks on a theorem of Donsker and Kolmogorov" by A. P. Araujo and L. Le Cam.
Markov dependence was introduced by Markov (1906) as a natural extension of independence such that the asymptotic properties of sums of r.v.'s, say the law of large numbers and the normal convergence, may be expected to continue to hold under reasonable restrictions. As the restrictions were gradually removed and the setup expanded, new types of behavior and problems appeared. Independently, similar ones appeared in physics (Chapman, Fokker, Planck, etc.).

In his fundamental paper (1931) Kolmogorov introduces, rigorously, Markov dependence in the continuous case and shows that the transition pr.'s satisfy certain differential or integro-differential equations under various restrictions (primarily of the Lindeberg type for normal convergence, on the conditional moments, or of the continuity type $P[X_{t+h} = X_t] \to 1$ as $h \to 0$). The leitmotiv is the search for local characteristics. Feller explores this new field of research in two series of basic papers essentially of a purely analytical character. First (1936, 1940) he pursues Kolmogorov's approach through local characteristics of transition pr.'s, investigating conditions under which there do exist transition pr.'s with various local characteristics. Then (1952 on) he uses and expands the semi-group theory (created by Hille and Yosida and immediately applied by them to Markov evolution in time) and introduces and analyzes "Feller" or "stable Markov" processes; his work is at the root of novel developments in Markov processes.

Meanwhile Doblin (1938-39) proceeds to a direct sample analysis under a uniform continuity condition on transition pr.'s which leads to step sample functions; Doob (1945), upon removing the uniformity restriction, discovers sample discontinuities more complicated than jumps; P. Lévy (1951) flushes the then monstrous sample possibilities into the open; and Kinney (1953) investigates sample continuity properties.
The purely analytical and the sample analysis gradually merge. Fortet early (1943) examines sample continuity in connection with the Kolmogorov-Feller approach. Later (1955) in connection with the Hille-Yosida-Feller semi-group approach, Neveu combines the two lines of attack. At the same time, Dynkin begins a series of basic papers in which he establishes and extends Feller's results and pursues an investigation of various Markov evolutions in time by means of an intimate blend of sample and semi-group analysis; for this purpose Dynkin and Yushkevich isolate and analyze the concept of strong Markov dependence, first mentioned by Doob (1945), and which is essential for the sample analysis of Markov processes. Hunt (1957) in a basic series of papers connects potential theory with Markov processes.

Let us only mention another line of attack by difference, differential, and integral stochastic equations in the r.f.'s themselves, now in the process of growth, thanks to the works of Bernstein, P. Lévy, Doob, Maruyama, and especially Itô (1951).

MARKOV DEPENDENCE


43.1. Markov property. In order to describe and analyze Markov dependence in intuitive yet precise terms and also to simplify the writing, we recall and add some terminology and notation.

Let $r, s, t$ with or without affixes denote the elements of a linearly ordered index set $T$. To fix the ideas, we take $T$ to be a set of reals with the usual ordering, say, the set of all reals or of all integers or of all nonnegative reals or of all nonnegative integers; in fact, we shall end up by taking $T = [0, \infty)$. It is convenient to give the indices a phenomenological meaning: they will represent moments of "time." Thus, an ordered triplet $U < V < W$ of disjoint index subsets (that is, every element of $U$ precedes every element of $V$ which precedes every element of $W$) represents a "past" $U$, a "present" $V$, and a "future" $W$.

R.v.'s $X, Y, \cdots$ with or without affixes are measurable functions on a pr. space $(\Omega, \mathcal{A}, P)$ to a measurable space $(\mathcal{X}, \mathcal{S})$ or state space. In general, the r.v.'s take their values in a topological space $\mathcal{X}$, and $\mathcal{S}$ is the $\sigma$-field of topological Borel sets, that is, the $\sigma$-field generated by the class of open sets. Most of the considerations in this chapter remain valid for more general state spaces and especially for locally compact separable metric state spaces and, thus, are frequently couched in general terms. However, to fix the ideas we assume that unless otherwise stated our r.v.'s are numerical: the state space is a Borel set of the (extended) real line together with the $\sigma$-field of its Borel sets. In general, the extension to the $N$-dimensional real spaces will be trivial. But the reader is invited to transpose what follows to more general state spaces.

$\mathcal{B}$ with or without affixes will denote sub $\sigma$-fields of events $B$ with same affixes if any, unless otherwise stated. In this subsection only we denote by $\mathcal{B}_{T'}$ the union $\sigma$-field of the $\mathcal{B}_t$, $t \in T'$; that is, the smallest $\sigma$-field containing all of them. Note that the finite sums of finite intersections $B_{t_1} \cap \cdots \cap B_{t_n}$ of their events over all finite subsets of $T'$ form a field $\mathcal{C}$ which generates $\mathcal{B}_{T'}$, the smallest monotone field containing $\mathcal{C}$. In connection with this construction we shall use the properties of c.pr.'s and c.exp.'s without further comment. $\mathcal{B}_{T'}$-measurable functions whose expectation exists will be denoted by $Y_{T'}$ unless otherwise stated.

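Finally, in accordance with 25.3A, we introduce the following equivalent terminologies for conditional independence:

$\mathcal{B}_1$ and $\mathcal{B}_3$ are c.ind. given $\mathcal{B}_2$: $P(B_1 B_3 \mid \mathcal{B}_2) = P(B_1 \mid \mathcal{B}_2) P(B_3 \mid \mathcal{B}_2)$ a.s.

$\mathcal{B}_1$ is c.ind. of $\mathcal{B}_3$ given $\mathcal{B}_2$: $P(B_1 \mid \mathcal{B}_{23}) = P(B_1 \mid \mathcal{B}_2)$ a.s.

$\mathcal{B}_3$ is c.ind. of $\mathcal{B}_1$ given $\mathcal{B}_2$: $P(B_3 \mid \mathcal{B}_{12}) = P(B_3 \mid \mathcal{B}_2)$ a.s.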

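A family $(\mathcal{B}_t, t \in T)$ of $\sigma$-fields of events is said to be Markovian or a Markov family if the following equivalent forms of Markov property hold:

For any past, present, and future

($M_{\text{future}}$) The future of the family is c.ind. of its past given its present:

$$P(B_{\text{future}} \mid \mathcal{B}_{\text{present+past}}) = P(B_{\text{future}} \mid \mathcal{B}_{\text{present}}) \quad \text{a.s.}$$

($M_{\text{past}}$) The past of the family is c.ind. of its future given its present:

$$P(B_{\text{past}} \mid \mathcal{B}_{\text{present+future}}) = P(B_{\text{past}} \mid \mathcal{B}_{\text{present}}) \quad \text{a.s.}$$

($M_{\text{past, future}}$) The past and the future of the family are c.ind. given its present:

$$P(B_{\text{past}} B_{\text{future}} \mid \mathcal{B}_{\text{present}}) = P(B_{\text{past}} \mid \mathcal{B}_{\text{present}}) P(B_{\text{future}} \mid \mathcal{B}_{\text{present}}) \quad \text{a.s.}$$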
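
For a chain with finitely many states and three time points, the forms of the Markov property above can be checked by direct enumeration. The sketch below is only an illustration under that finite assumption: the three coordinates play the roles of past, present, and future, and the names `joint_law`, `marginal`, and `max_markov_defect` are hypothetical helpers, not notation from the text.

```python
from itertools import product


def joint_law(p0, p01, p12):
    """Joint law of (X0, X1, X2) built from an initial law and two
    one-step transition matrices."""
    n = len(p0)
    return {
        (x, y, z): p0[x] * p01[x][y] * p12[y][z]
        for x, y, z in product(range(n), repeat=3)
    }


def marginal(joint, keep):
    """Marginal law of the coordinates listed in `keep`."""
    out = {}
    for state, p in joint.items():
        key = tuple(state[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out


def max_markov_defect(joint):
    """Largest violation of (M_future) and of (M_past,future) over all states."""
    pxy, py, pyz = marginal(joint, (0, 1)), marginal(joint, (1,)), marginal(joint, (1, 2))
    worst_future = worst_product = 0.0
    for (x, y, z), p in joint.items():
        if pxy[(x, y)] == 0.0:
            continue  # conditioning event has probability zero
        future_given_both = p / pxy[(x, y)]
        future_given_present = pyz[(y, z)] / py[(y,)]
        worst_future = max(worst_future, abs(future_given_both - future_given_present))
        lhs = p / py[(y,)]
        rhs = (pxy[(x, y)] / py[(y,)]) * future_given_present
        worst_product = max(worst_product, abs(lhs - rhs))
    return worst_future, worst_product


if __name__ == "__main__":
    chain = joint_law([0.3, 0.7], [[0.9, 0.1], [0.4, 0.6]], [[0.5, 0.5], [0.2, 0.8]])
    print(max_markov_defect(chain))
```

A law in which the future copies the past regardless of the present, for instance mass 1/4 on each triple $(x, y, x)$ with $x, y \in \{0, 1\}$, gives a strictly positive defect, so the check does distinguish Markov from non-Markov dependence.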


It will be convenient to have at our disposal seemingly weaker and 
seemingly stronger yet equivalent forms of Markov property. The 
lemma below permits the passage from events to functions. 


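a. Let $\mathcal{B}_1, \mathcal{B}_2, \mathcal{B}_3$ be $\sigma$-fields of events and let $\mathcal{B}_3$ be generated by a field $\mathcal{C}$ of finite sums of events $D$ belonging to a class $\mathcal{D}$.

If $P(D \mid \mathcal{B}_{12}) = P(D \mid \mathcal{B}_2)$ a.s. then $P(B_3 \mid \mathcal{B}_{12}) = P(B_3 \mid \mathcal{B}_2)$ a.s. and, more generally, $E(Y_{23} \mid \mathcal{B}_{12}) = E(Y_{23} \mid \mathcal{B}_2)$ a.s.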
Proof. The class of events $B$ such that $P(B \mid \mathcal{B}_{12}) = P(B \mid \mathcal{B}_2)$ a.s. is closed under countable summations and contains $\mathcal{D}$ hence contains $\mathcal{C}$. It is also closed under monotone passages to the limit hence it contains the monotone field $\mathcal{B}_3$ generated by the field $\mathcal{C}$. This proves the particular assertion.

Since the family of functions $Y$ (whose exp.'s exist) such that $E(Y \mid \mathcal{B}_{12}) = E(Y \mid \mathcal{B}_2)$ a.s. is closed under multiplications by the indicators $I_{B_2}$ and we just proved that it contains the $I_{B_3}$, it follows that it contains the $I_{B_2 B_3} = I_{B_2} I_{B_3}$. Therefore, by the particular assertion, it contains the $I_{B_{23}}$. But this family is also closed under linear combinations whose exp.'s exist and under passages to the limit by nondecreasing sequences of its nonnegative elements. Thus it contains the simple $\mathcal{B}_{23}$-measurable functions then the nonnegative $\mathcal{B}_{23}$-measurable functions and finally all the $\mathcal{B}_{23}$-measurable functions whose exp.'s exist. The proof is terminated.

The next lemma permits the extension of the futures. 

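b. Let $\mathcal{B}_1, \mathcal{B}_2, \mathcal{B}_3, \mathcal{B}_4$ be $\sigma$-fields of events. If (i) $P(B_3 \mid \mathcal{B}_{12}) = P(B_3 \mid \mathcal{B}_2)$ a.s. and (ii) $P(B_4 \mid \mathcal{B}_{123}) = P(B_4 \mid \mathcal{B}_3)$ a.s., then $P(B_{34} \mid \mathcal{B}_{12}) = P(B_{34} \mid \mathcal{B}_2)$ a.s. and, more generally, $E(Y_{234} \mid \mathcal{B}_{12}) = E(Y_{234} \mid \mathcal{B}_2)$ a.s.

Proof. According to a, (i) yields

(1) $E(Y_{23} \mid \mathcal{B}_{12}) = E(Y_{23} \mid \mathcal{B}_2)$ a.s.

while, upon multiplying by $I_{B_3}$ and conditioning by $\mathcal{B}_{23}$, (ii) yields

(2) $E(I_{B_3} I_{B_4} \mid \mathcal{B}_{23}) = E(I_{B_3} I_{B_4} \mid \mathcal{B}_3) = E(I_{B_3} I_{B_4} \mid \mathcal{B}_{123})$ a.s.

But we always have

(3) $E(I_{B_3 B_4} \mid \mathcal{B}_{12}) = E\{E(I_{B_3 B_4} \mid \mathcal{B}_{123}) \mid \mathcal{B}_{12}\}$ a.s.

where, by (2), the right side reduces to $E\{E(I_{B_3 B_4} \mid \mathcal{B}_{23}) \mid \mathcal{B}_{12}\}$ a.s. and the c.exp. within the last expression is $\mathcal{B}_{23}$-measurable. Therefore, by (1), the right side of (3) reduces to

$$E\{E(I_{B_3 B_4} \mid \mathcal{B}_{23}) \mid \mathcal{B}_2\} = E(I_{B_3 B_4} \mid \mathcal{B}_2) \quad \text{a.s.}$$

and (3) becomes

$$P(B_3 B_4 \mid \mathcal{B}_{12}) = P(B_3 B_4 \mid \mathcal{B}_2) \quad \text{a.s.}$$

The lemma follows on account of a, and the proof is terminated.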

We are primarily interested in the Markovian evolution and thus explore the future as time increases. Therefore, we shall use primarily the corresponding form ($M_{\text{future}}$) of the Markov property and be content with stating its equivalent formulations. Besides, the equivalent formulations of the other forms will then follow at once on account of 25.3A.

A. Markov equivalence theorem. The following equivalent relations characterize the Markov property:

(M) $P(B_{\text{future}} \mid \mathcal{B}_{\text{present+past}}) = P(B_{\text{future}} \mid \mathcal{B}_{\text{present}})$ a.s. or

(M') $E(Y_{\text{present+future}} \mid \mathcal{B}_{\text{present+past}}) = E(Y_{\text{present+future}} \mid \mathcal{B}_{\text{present}})$ a.s.

for

(i) any past, present, and future or

(ii) any instant present $s$, the whole past $[r: r < s]$, and the whole future $[t: t > s]$ or

(iii) any finite past, instant present, and instant future.

Note that it suffices that

(M'') the c.pr.'s of future events given the present and the past depend only upon the present:

$$P(B_{\text{future}} \mid \mathcal{B}_{\text{present+past}}) \text{ is } \mathcal{B}_{\text{present}}\text{-measurable.}$$

For, this property is implied by (M) and, upon conditioning it by $\mathcal{B}_{\text{present}}$, it yields (M).

Proof. Since (M) $\Rightarrow$ (M') on account of a and (M') $\Rightarrow$ (M) as a particular case, it suffices to consider (M). Since (i) $\Rightarrow$ (ii) $\Rightarrow$ (iii), it remains only to prove that (iii) $\Rightarrow$ (i). Thus, let $U < V < W$ and

(M) $P(B_W \mid \mathcal{B}_{U+V}) = P(B_W \mid \mathcal{B}_V)$ a.s.

for every finite $U$, singleton $V$, and singleton $W$. Since, by b, we can increase the future point by point, (M) extends to all finite subsets $W_n$ of any given future, whence, by a, to $W$.

Since for a singleton $V$ and for all finite subsets $U_n$ of any given past $U$

$$\int_B P(B_W \mid \mathcal{B}_{U+V}) = \int_B P(B_W \mid \mathcal{B}_{U_n+V}) = \int_B P(B_W \mid \mathcal{B}_V)$$

for every $B \in \mathcal{B}_{U_n+V}$ and since the $\mathcal{B}_{U_n+V}$ generate $\mathcal{B}_{U+V}$, it follows that the equality between the finite measures on every $\mathcal{B}_{U_n+V}$ defined by the extreme terms extends on $\mathcal{B}_{U+V}$. Therefore, the integrands coincide a.s. for any given $U$.
Finally, (M) extends to any given present $V$, as follows. If $V$ has a last element $t$, then take $U + V - \{t\}$ as the past, $\{t\}$ as the present, and $W$ as the future, so that

$$P(B_W \mid \mathcal{B}_{U+V}) = P(B_W \mid \mathcal{B}_t) \quad \text{a.s.}$$

It follows, upon conditioning by $\mathcal{B}_V$, that

$$P(B_W \mid \mathcal{B}_V) = P(B_W \mid \mathcal{B}_t) = P(B_W \mid \mathcal{B}_{U+V}) \quad \text{a.s.}$$

If $V$ has no last element, decompose $V$ into $V' + V''$ where $V' < V''$ and $V'$ has a last element. Then take $U$ as the past, $V'$ as the present, and $V'' + W$ as the future. According to what precedes

$$P(B_{V''+W} \mid \mathcal{B}_{U+V'}) = P(B_{V''+W} \mid \mathcal{B}_{V'}) \quad \text{a.s.}$$

and, interchanging the past $U$ and the future $V'' + W$, it follows that

$$P(B_U \mid \mathcal{B}_{V+W}) = P(B_U \mid \mathcal{B}_{V'}) \quad \text{a.s.}$$

Finally, conditioning by $\mathcal{B}_V$ so that

$$P(B_U \mid \mathcal{B}_V) = P(B_U \mid \mathcal{B}_{V'}) = P(B_U \mid \mathcal{B}_{V+W}) \quad \text{a.s.}$$

and interchanging the past $U$ and the future $W$ so that

$$P(B_W \mid \mathcal{B}_V) = P(B_W \mid \mathcal{B}_{U+V}) \quad \text{a.s.,}$$

the extension of (M) to any given present is proved and the proof is terminated.

The time set $T$ consists most frequently of all $t \ge t_0$ with $t_0 = 0$ and the Markov evolution is analyzed from instant presents $s$ to the futures $t \ge s$. Thus, it is primarily the form ($M_{\text{future}}$) which is used. However, difficulties arise at once: The c.pr.'s in ($M_{\text{future}}$) may have no regular versions, so that we cannot treat them as measures; and even if they have, we may not be able to select the regular versions so as to transform all the a.s. equalities into strict equalities, and then we are faced with too many exceptional null events since $T$ is uncountable. If none of these difficulties arises, so that by a suitable choice of c.pr.'s the equalities ($M_{\text{future}}$) become strict equalities and we have no exceptional null events, then we say that the Markov property is regular and further analysis of a regular Markov evolution becomes possible. If, moreover, the evolution is independent of the instant present, that is, the c.pr.'s are invariant under translations in time, then we say that the Markov property is stationary (or homogeneous in time). We shall be mostly concerned with the stationary Markov property.
43.2. Regular Markov processes. A r.f. $X_T = (X_t, t \in T)$ is said to be Markovian or a Markov r.f. if the family of $\sigma$-fields $\mathcal{B}_t = \mathcal{B}(X_t) = X_t^{-1}(\mathcal{S})$ of events induced by the $X_t$ is Markovian. A process is said to be Markovian or a Markov process if it consists of Markov r.f.'s with same induced $\sigma$-fields. Since $\mathcal{B}(X_t) = \mathcal{B}(Y_t)$ means that there exists an invertible Borel $g_t$ such that $Y_t = g_t(X_t)$ and $X_t = g_t^{-1}(Y_t)$, we may consider a Markov process as a Markov r.f. defined up to an invertible scale function $g_T$.

In this connection, it is sometimes convenient to modify the definitions so as not to require invertibility. A Markov r.f. becomes a family $(X_t, \mathcal{B}_t, t \in T)$ where the $X_t$ are $\mathcal{B}_t$-measurable and the $\sigma$-fields $\mathcal{B}_t$ form a Markov family so that, a fortiori, $(X_t, t \in T)$ is a Markov r.f. according to our definition. Then, a Markov process becomes the family of all r.f.'s $X_T$ with $\mathcal{B}_t$-measurable $X_t$'s, where the given $\mathcal{B}_t$ form a Markov family.

For r.f.'s $X_T$, the Markov property (M ii), equivalently, (M' ii), which emphasizes the Markov evolution from any instant present, becomes

$$P(B \mid X_r, r \le s) = P(B \mid X_s) \ \text{a.s., equivalently,} \quad E(Y \mid X_r, r \le s) = E(Y \mid X_s) \ \text{a.s.}$$

where $B$ and $Y$ are defined on $(X_t, t \ge s)$. In order to examine regularity possibilities, we take its most particular case. First, we take an instant past $\{r\}$ so that, upon conditioning by $(X_r, X_s)$, we have

$$P(B \mid X_r, X_s) = P(B \mid X_s) \quad \text{a.s.}$$

On account of the general smoothing property $P(B \mid X_r) = E\{P(B \mid X_r, X_s) \mid X_r\}$, it yields the relation

$$P(B \mid X_r) = E\{P(B \mid X_s) \mid X_r\} \quad \text{a.s.}$$

Next, we take an instant future $\{t\}$ so that the relation becomes

$$P(X_t \in S \mid X_r) = E\{P(X_t \in S \mid X_s) \mid X_r\}, \quad r < s < t, \ S \in \mathcal{S},$$

outside a null event of the form $[X_r \in S_0]$, where the Borel set $S_0$ may vary with $r, s, t$, and with $S$.

If there are regular versions of the c.pr.'s which figure in the last relation, then its right side can be written as an integral. If moreover the regular versions may be selected so as to make disappear the exceptional sets $S_0$ for all $r, s, t$ and all $S$, then we have regularity at least in the particular case of instant present, past, and future. Thus, the requirement that it be possible to transform this relation into strict equality is a minimal one for regularity. We then give to the relation the name of (Chapman-)Kolmogorov equation and to the c.pr.'s therein the name of transition pr.'s. More precisely, a function $P_{st}(x, S)$ defined for all pairs $(s, t) \in T \times T$ with $s < t$ and for all $x \in \mathcal{X}$, $S \in \mathcal{S}$, is a transition pr. (tr.pr.) if it is $\mathcal{S}$-measurable in $x$ and a pr. in $S$ and if the Kolmogorov equation holds: for all $r, s, t$ with $r < s < t$, all $x$ and all $S$

(K) $$P_{rt}(x, S) = \int P_{rs}(x, dy) P_{st}(y, S).$$

We add the convention: $P_{ss}(x, S) = I(x, S)$ ($= 1$ or $0$ according as $x \in S$ or $x \notin S$) so that the tr.pr.'s are now defined for all pairs $(s, t)$ with $s \le t$, and then the Kolmogorov equation clearly holds with $r \le s \le t$.

The Kolmogorov equation can be given a seemingly stronger form which parallels the passage of (M) to (M'), that is, from events to functions. Let $G$ be the space of all bounded Borel functions on $\mathcal{X}$ (so that all integrals below exist and are finite). The relation

$$T_{uv} g(x) = \int P_{uv}(x, dy) g(y), \quad u \le v,$$

defines a function $T_{uv} g$ which clearly also belongs to $G$. In particular, for $g(x) = I(x, S)$ we have $T_{uv} g(x) = P_{uv}(x, S)$, and $T_{uu} = I$, the identity operator on $G$. Thus, it defines a tr. operator $T_{uv}$ on $G$ (to $G$) which is an extension of the transformation of indicators $I(x, S)$ into tr.pr.'s $P_{uv}(x, S)$. Furthermore, according to the Kolmogorov equation, for $r \le s \le t$

$$(T_{rt} g)(x) = \int P_{rt}(x, dz) g(z) = \int\!\!\int P_{rs}(x, dy) P_{st}(y, dz) g(z) = \int P_{rs}(x, dy) \left\{\int P_{st}(y, dz) g(z)\right\} = \{T_{rs}(T_{st} g)\}(x).$$

Therefore, the family of transformations $T_{uv}$ on $G$ has the so-called generalized semi-group property:

(K') $$T_{rt} = T_{rs} T_{st}, \quad r \le s \le t.$$

Conversely, upon applying (K') to indicators $g(x) = I(x, S)$, Kolmogorov equation follows. Thus

a. The Kolmogorov equation for tr.pr.'s and the generalized semi-group property for corresponding tr. operators are equivalent.
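
When the state space is finite, a tr.pr. is a family of stochastic matrices, (K) is a matrix product, and (K') says that applying the matrices to a function composes in the same way. The following sketch assumes integer times and builds $P_{st}$ from one-step matrices; the helper names are illustrative only.

```python
def mat_mul(a, b):
    """Product of two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def identity(n):
    """n x n identity matrix: the convention P_ss(x, S) = I(x, S)."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def transition(steps, s, t):
    """P_st as the product of the one-step matrices steps[s], ..., steps[t-1]."""
    p = identity(len(steps[0]))
    for k in range(s, t):
        p = mat_mul(p, steps[k])
    return p


def apply_operator(p, g):
    """(T g)(x) = sum_y P(x, y) g(y), the finite form of the tr. operator."""
    return [sum(p[x][y] * g[y] for y in range(len(g))) for x in range(len(p))]


if __name__ == "__main__":
    steps = [
        [[0.9, 0.1], [0.2, 0.8]],
        [[0.5, 0.5], [0.3, 0.7]],
        [[0.6, 0.4], [0.1, 0.9]],
    ]
    g = [1.0, -2.0]
    # (K): P_03 = P_01 P_13, and (K'): T_03 g = T_01 (T_13 g).
    print(transition(steps, 0, 3))
    print(mat_mul(transition(steps, 0, 1), transition(steps, 1, 3)))
    print(apply_operator(transition(steps, 0, 3), g))
    print(apply_operator(transition(steps, 0, 1), apply_operator(transition(steps, 1, 3), g)))
```

Taking `g` to be an indicator vector recovers (K) from (K'), which is the converse direction of assertion a.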




MARKOV PROCESSES [Sec. 43]


What precedes leads to the following definition: A r.f. $X_T$ is a regular Markov r.f. if there exists a tr.pr. $P_{st}(x, S)$ such that for all pairs $(s, t)$ with $s \le t$

$$P(X_t \in S \mid X_r, r \le s) = P_{st}(X_s, S) \quad \text{a.s.}$$

The r.f. is Markovian, since conditioning by $X_s$ yields

$$P(X_t \in S \mid X_s) = P_{st}(X_s, S) = P(X_t \in S \mid X_r, r \le s) \quad \text{a.s.;}$$

by omitting "a.s.", we can and do select regular versions of its c.pr.'s so as to have a regular Markov property. Thus, the definition of a regular Markov r.f. $X_T$ and the choice of regular versions of c.pr.'s coalesce into the strict equality

(MR) $$P(X_t \in S \mid X_r, r \le s) = P(X_t \in S \mid X_s)$$

with the identity

$$P(X_t \in S \mid X_s) = P_{st}(X_s, S),$$

where the values $P_{st}(x, S)$ of the right side are those of a tr.pr.

Note that for $t = s$ the identity reduces to, hence justifies, our convention for tr.pr.'s.

From now on and unless otherwise stated the time set $T$ has a first element $t_0$, $P_{st}(x, S)$ is a tr.pr. and the "initial distribution" $P_{t_0}$ is a pr. on $\mathcal{S}$.

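Let $(\mathcal{X}_T, \mathcal{S}_T) = \prod_{t \in T} (\mathcal{X}_t, \mathcal{S}_t)$ be the sample space of a regular Markov r.f. On account of (MR) the tr.pr. $P_{st}(x, S)$ determines its conditional distributions $P_T^x$ given $X_{t_0} = x$. For, according to the Tulcea theorem 8.3A and to 27.2b with $g(x_1, \cdots, x_n) = I_{S_{t_1}}(x_1) \times \cdots \times I_{S_{t_n}}(x_n)$, $t_1 < \cdots < t_n$, $S_t \in \mathcal{S}_t$, $P_T^x$ is determined by

(CD) $$P_T^x C(S_{t_1} \times \cdots \times S_{t_n}) = \int_{S_{t_1}} P_{t_0 t_1}(x, dx_1) \int_{S_{t_2}} \cdots \int_{S_{t_n}} P_{t_{n-1} t_n}(x_{n-1}, dx_n).$$

It follows that, for the law of $X_{t_0}$ given by an initial distribution $P_{t_0}$, the law of $X_T$ is determined by its distribution defined by

(D) $$P_T S_T = \int_{\mathcal{X}} P_{t_0}(dx) P_T^x S_T, \quad S_T \in \mathcal{S}_T.$$

Thus, the c.pr. $P^x$ given $X_{t_0} = x$ on $\mathcal{B}(X_T)$, the $\sigma$-field of events $B = [X_T \in S_T]$, is given by

(CP) $$P^x[X_T \in S_T] = P_T^x S_T, \quad S_T \in \mathcal{S}_T.$$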
On account of the integration theorem 26.1A, the c.exp. $E^x$ given $X_{t_0} = x$ on the $\mathcal{B}(X_T)$-measurable functions $\xi$ (defined on $\mathcal{X}_T$) whose expectations exist, is given by

(CE) $$E^x \xi = \int \xi(x_T) P_T^x(dx_T),$$

and, for an initial pr. $P_{t_0}$, hence for an initial distribution $P_{t_0}(S) = P[X_{t_0} \in S]$,

(P): $PB = \int P_{t_0}(dx) P^x B$, (E): $E\xi = \int P_{t_0}(dx) E^x \xi$.

What precedes leads us to the following definition: A regular Markov process is a family of regular Markov r.f.'s with common tr.pr. and arbitrary initial distributions. The tr.pr. $P_{st}(x, S)$ determines by means of (CD), (CP), and (CE) the common conditional distributions $P_T^x$, c.pr.'s $P^x$, and c.exp.'s $E^x$, given $X_{t_0} = x$. The choice of the initial distribution $P_{t_0}$ determines the law of the corresponding r.f. of the process.
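
For a finite state space the iterated integrals in (CD) become iterated sums, and (D) becomes an average over the initial distribution. The sketch below assumes, for simplicity, a single one-step matrix `p` used between consecutive time points (a stationary tr.pr.); `cylinder_prob` and `law_prob` are illustrative names.

```python
def cylinder_prob(p, x, sets):
    """Finite form of (CD): P^x[X_1 in S_1, ..., X_n in S_n], computed by
    pushing the mass forward one step at a time and restricting to S_k."""
    weights = {x: 1.0}
    for allowed in sets:
        pushed = {}
        for u, w in weights.items():
            for v in allowed:
                pushed[v] = pushed.get(v, 0.0) + w * p[u][v]
        weights = pushed
    return sum(weights.values())


def law_prob(p0, p, sets):
    """Finite form of (D): average of the conditional distributions P^x
    with respect to the initial distribution p0."""
    return sum(p0[x] * cylinder_prob(p, x, sets) for x in range(len(p0)))


if __name__ == "__main__":
    p = [[0.9, 0.1], [0.4, 0.6]]
    print(cylinder_prob(p, 0, [{1}, {0}]))       # path 0 -> 1 -> 0
    print(law_prob([0.5, 0.5], p, [{1}]))        # P[X_1 = 1] under a uniform start
```

The same transition matrix serves every initial distribution: changing `p0` changes the law of the r.f. but not the conditional distributions, which is the distinction between a regular Markov process and its individual r.f.'s.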

Since the regular Markov property (MR) is in terms of c.pr.'s only, we may also define a regular Markov process as follows. Let $X_T$ be a family of measurable functions $X_t$ on a measurable space $(\Omega, \mathcal{A})$ to a measurable space $(\mathcal{X}, \mathcal{S})$ (with all singletons $\{x\} \in \mathcal{S}$) and let $(P^x, x \in \mathcal{X})$ be a family of pr.'s on the induced $\sigma$-field $\mathcal{B} = \mathcal{B}(X_T)$ with

$$P^x[X_{t_0} = x] = P^x(X_{t_0}^{-1}\{x\}) = 1.$$

If for every $x \in \mathcal{X}$ there exists a tr.pr. $P_{st}(x, S)$ such that (CD) holds for $P_T^x$ defined by (CP), then we may say that $X_T$ is a regular Markov process. For every choice of an initial distribution $P_{t_0}$ on $\mathcal{S}$, $X_T$ becomes a regular Markov r.f. on the pr. space $(\Omega, \mathcal{B}, P)$ where $P$ on $\mathcal{B}$ is determined by (P).

The two definitions correspond to the two ways of viewing regular processes and, as long as we are concerned with one regular Markov process with a given tr.pr., we may use either of these definitions according to convenience. However, this raises a basic question: whether given an arbitrary tr.pr. there always exists a corresponding Markov process. The answer is in the affirmative, as follows.

A. Regular Markov existence theorem. To any tr.pr. there cor¬ 
responds a regular Markov process with a determined law of evolution. 

To any tr.pr. and any initial distribution there corresponds a regular 
Markov r.f. with a determined law. 
Proof. The assertions relative to r.f.'s follow from those relative to processes on account of the necessary condition (P). The assertion relative to the law of evolution follows from the existence assertion on account of the necessary condition (CD). It remains to prove the existence assertion for processes. The necessary conditions (CD) and (CP) show the way: Take $(\Omega, \mathcal{A}) = (\mathcal{X}_T, \mathcal{S}_T)$ with points $x_T = (x_t, t \in T)$. Define $X_T = (X_t, t \in T)$ by $X_t(x_T) = x_t$ and using the given tr.pr. $P_{st}(x, S)$, by means of (CD) and (CP), construct the family of pr.'s $P^x = P_T^x$, $x \in \mathcal{X}$. The process so defined is a regular Markov process with the given tr.pr., provided we can show that for every finite index set $t_1 < \cdots < t_n < s < t$ and every $S_t \in \mathcal{S}_t$ it is possible to select versions of c.pr.'s such that

$$P(X_t \in S_t \mid X_{t_1}, \cdots, X_{t_n}, X_s) = P_{st}(X_s, S_t)$$

or, equivalently,

$$P^x C(S_{t_1} \times \cdots \times S_{t_n} \times S_s \times S_t) = \int_{C(S_{t_1} \times \cdots \times S_{t_n} \times S_s)} P^x(dx_T) P_{st}(x_s, S_t).$$

But by the construction of $P^x$, both sides reduce to

$$\int_{S_{t_1} \times \cdots \times S_{t_n} \times S_s} P_{t_0 t_1}(x, dx_{t_1}) \cdots P_{t_n s}(x_{t_n}, dx_s) P_{st}(x_s, S_t)$$

and the proof is terminated.

There is a one-to-one correspondence between tr.pr.'s $P_{st}(x, S)$ and tr.d.f.'s $F_{st}^x$ defined by $F_{st}^x(y) = P_{st}(x, (-\infty, y))$ (when $\mathcal{X} = R$). The Kolmogorov equation in terms of tr.d.f.'s is

$$F_{rt}^x(z) = \int F_{rs}^x(dy) F_{st}^y(z).$$
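
For a tr.pr. with densities, the Kolmogorov equation can be checked by numerical integration. The sketch below takes, as an assumption chosen for illustration, the Brownian tr. density $p_{st}(x, y)$, normal with mean $x$ and variance $t - s$, and compares $\int p_{rs}(x, y)\, p_{st}(y, z)\, dy$ with $p_{rt}(x, z)$. The function names and the integration window are arbitrary choices.

```python
import math


def gauss_kernel(h: float, x: float, y: float) -> float:
    """Normal density in y with mean x and variance h > 0."""
    return math.exp(-((y - x) ** 2) / (2.0 * h)) / math.sqrt(2.0 * math.pi * h)


def compose(r: float, s: float, t: float, x: float, z: float,
            half_width: float = 12.0, n: int = 4000) -> float:
    """Trapezoid-rule value of the integral over y of
    gauss_kernel(s - r, x, y) * gauss_kernel(t - s, y, z)."""
    lo = min(x, z) - half_width
    hi = max(x, z) + half_width
    step = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        y = lo + i * step
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * gauss_kernel(s - r, x, y) * gauss_kernel(t - s, y, z)
    return total * step


if __name__ == "__main__":
    r, s, t, x, z = 0.0, 1.0, 3.0, 0.2, -0.5
    print(compose(r, s, t, x, z), gauss_kernel(t - r, x, z))
```

The agreement reflects the fact that variances of independent normal increments add, so the intermediate time $s$ drops out of the composed kernel.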

Tr.d.f.'s $F_{st}^x$ are d.f.'s with a parameter $x$ in which they are Borelian, and there is a one-to-one correspondence between tr.d.f.'s $F_{st}^x$ and tr.ch.f.'s $f_{st}^x$ defined by $f_{st}^x(u) = \int e^{iuy} F_{st}^x(dy)$. However, in the study of local characteristics under some continuity condition we are primarily interested in the Markov behavior in the neighborhood of given states $x$ at times $s$. This leads to the centering at $x$ of the tr.d.f. $F_{st}^x$ and of the tr.ch.f. $f_{st}^x$, hence to the introduction of

$$\bar F_{st}^x(y) = P(X_t - X_s < y \mid X_s = x) = F_{st}^x(x + y), \qquad \bar f_{st}^x(u) = e^{-iux} f_{st}^x(u).$$
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A trivial example of regular Markov processes and of tr.d.f/s is pro¬ 
vided by processes (X ty / ^ 0) with independent r.v.’s; then F 9t x = F t 
and Kolmogorov’s equation reduces to an identity. An important and 
suggestive example of regular Markov processes and of tr.d.f/s is pro¬ 
vided by decomposable processes with increments X 8 t ； then T st x = F st 
and Kolmogorov’s equation becomes the composition relation F rt = 
F ra * F st . Concepts and problems relative to decomposable processes 
suggest similar ones for regular Markov processes. In particular, the 
concept of law derivative will lead us to the historically important local 
characterizations of tr.d.f/s, as follows. 

We say that a ch.f. $f_t^x$ represents the tr. law (right) (left) derivative
given x at t of a regular Markov process $X_T$ if $(\bar f^x_{t-h,\,t+k})^{[1/(h+k)]} \to f_t^x$
as $h + k \to 0$ with $h, k \ge 0$ $(h = 0)$ $(k = 0)$, $h + k > 0$. According to
the central convergence theorem, if the limit ch.f. exists then it is
necessarily an i.d. ch.f. $f_t^x = e^{\psi_t^x}$, $\psi_t^x = (\alpha_t^x, \Psi_t^x)$, and the process is tr. law
(right) (left) continuous given x at t: $F^x_{t-h,\,t+k}(y) \to 0$ or 1 according as
y < x or y > x. While in the decomposable case we could use the
convergence theorem for i.d. laws, here we need the more general central
convergence criterion but for identically distributed summands, that is,
with $F_{nk} = F_n$:

$f_n^n \to f = e^{\psi}$, $\psi = (\alpha, \Psi)$, if and only if $\Psi_n \xrightarrow{c} \Psi$ and $\int d\Psi_n(x)/x \to \alpha$
with

$$\Psi_n(x) = n \int_{-\infty}^{x} \frac{y^2}{1 + y^2}\, dF_n.$$

It suffices to note that because of $F_{nk} = F_n$, condition (iii) of the criterion
yields

$$n \Big( \int_{|x| < \epsilon} x\, dF_n \Big)^2 = \Big| \int_{|x| < \epsilon} x\, dF_n \Big| \times \Big| n \int_{|x| < \epsilon} x\, dF_n \Big| \to 0$$

so that in its condition (ii) the left side can be replaced by $n \int_{|x| < \epsilon} \frac{x^2}{1 + x^2}\, dF_n$,
and the assertion follows by elementary computations.

Upon applying this particular form of the central convergence criterion
to tr. law derivatives, we have

b. Tr. law derivative existence criterion. The tr. law (right)
(left) derivative $e^{\psi_t^x}$, $\psi_t^x = (\alpha_t^x, \Psi_t^x)$, exists if and only if

$$\Psi^x_{t-h,\,t+k} \xrightarrow{c} \Psi_t^x \quad \text{and} \quad \int d\Psi^x_{t-h,\,t+k}(y)/y \to \alpha_t^x$$

as $h + k \to 0$ $(h = 0)$ $(k = 0)$, where

$$\Psi^x_{t-h,\,t+k}(y) = \frac{1}{h + k} \int_{-\infty}^{y} \frac{z^2}{1 + z^2}\, d\bar F^x_{t-h,\,t+k}(z).$$

Next we note that, as for $g(x) = e^{iux}$, $x \in R$,

If a function g on R is bounded and twice differentiable then the function
$h_g$ on $R \times R$, given by

$$h_g(x, y) = \Big\{ g(x + y) - g(x) - \frac{y}{1 + y^2}\, g'(x) \Big\} \frac{1 + y^2}{y^2}$$

and defined by continuity at y = 0 to be $h_g(x, 0) = g''(x)/2$, is bounded
and continuous in y for every fixed x.

The two foregoing remarks yield

c. If $f_n^n \to f = e^{\psi}$, $\psi = (\alpha, \Psi)$, then for every bounded and twice
differentiable function g on R and every fixed $x \in R$

$$n \Big\{ \int g(x + y)\, dF_n(y) - g(x) \Big\} \to \alpha g'(x) + \int h_g(x, y)\, d\Psi(y).$$

It suffices to apply the Helly-Bray theorem to the left side expressed as

$$n \int (g(x + y) - g(x))\, dF_n(y) = g'(x) \int d\Psi_n(y)/y + \int h_g(x, y)\, d\Psi_n(y).$$

B. Tr. law derivatives theorem. Let $e^{\psi_s^x}$, $\psi_s^x = (\alpha_s^x, \Psi_s^x)$, $s \ge 0$,
represent the (right) (left) tr. law derivatives corresponding to a tr.pr.
$P_{st}(x, S)$, and let g on R be a bounded twice differentiable function. Then,
as $h + k \to 0$ $(h = 0)$ $(k = 0)$,

$$\frac{1}{h + k} \Big\{ \int P_{s-h,\,s+k}(x, dy)\, g(y) - g(x) \Big\} \to \alpha_s^x\, g'(x) + \int h_g(x, y)\, d\Psi_s^x(y).$$

In particular, for left tr. law derivatives, if $P_{st}(x, S)$ is twice differentiable in
x then its left derivative

$$\frac{\partial^-}{\partial s} P_{st}(x, S) = \lim_{h \to 0} \frac{1}{h} \{ P_{s-h,\,t}(x, S) - P_{st}(x, S) \}$$

exists and

$$\frac{\partial^-}{\partial s} P_{st}(x, S) = \alpha_s^x \frac{\partial}{\partial x} P_{st}(x, S) + \int \Big\{ P_{st}(x + y, S) - P_{st}(x, S) - \frac{y}{1 + y^2} \frac{\partial}{\partial x} P_{st}(x, S) \Big\} \frac{1 + y^2}{y^2}\, d\Psi_s^x(y).$$

Proof. The general assertion results from c, and the particular one
follows upon setting k = 0, $g(x) = P_{st}(x, S)$, and using Kolmogorov's
equation

$$P_{s-h,\,t}(x, S) = \int P_{s-h,\,s}(x, dy)\, P_{st}(y, S).$$

Upon introducing the P. Lévy form $\psi_s^x = (\alpha_s^x, (\beta_s^x)^2, L_s^x)$, the foregoing
integro-differential equation takes the explicit form

$$\frac{\partial^-}{\partial s} P_{st}(x, S) = \alpha_s^x \frac{\partial}{\partial x} P_{st}(x, S) + \frac{(\beta_s^x)^2}{2} \frac{\partial^2}{\partial x^2} P_{st}(x, S) + \int \Big\{ P_{st}(x + y, S) - P_{st}(x, S) - \frac{y}{1 + y^2} \frac{\partial}{\partial x} P_{st}(x, S) \Big\}\, dL_s^x(y).$$

Kolmogorov's "continuous case" corresponds to vanishing P. Lévy functions
$L_s^x$, hence to normal tr. law derivatives $\psi_s^x(u) = iu\alpha_s^x - (\beta_s^x)^2 u^2/2$.
Feller's "purely discontinuous" and "mixing" cases correspond to vanishing
functions $\alpha_s^x$, $(\beta_s^x)^2$ and to finite $L_s^x(\pm 0)$, respectively. In the
first purely analytical approach the study of Markov processes was
centered about the questions of existence, unicity, and tr.pr. properties
of solutions of these equations. Itô, to whom the foregoing theorem is
due, answers these questions by solving stochastic integral equations
under somewhat stringent restrictions. As we shall see later, the
semi-group approach leads to answers under weaker restrictions.

43.3. Stationarity. To discuss stationarity, that is, invariance under
translations in time, it is convenient to take once and for all $T = [0, \infty)$,
$u, v, r, s, t \ge 0$ and to use the terminology and notation of Section 33
for translates. Let $\xi = g(X_T)$ denote Borel functions g of $X_T$ and let B
denote events defined on $X_T$; the $\xi$ and the $I_B$ are $\mathcal{B}(X_T)$-measurable
functions. The translate $X_{s+T}$ of $X_T = (X_t, t \ge 0)$ by s is the family
$(X_{s+t}, t \ge 0)$ of the translates $X_{s+t}$ of the $X_t$ by s. The translate $\xi_s$ of
$\xi$ by s is defined by $\xi_s = g(X_{s+T})$ and the translate $B_s$ of B by s is defined
by $I_{B_s} = (I_B)_s$; the $\xi_s$ and the $I_{B_s}$ are $\mathcal{B}(X_T)$-measurable, in fact,
$\mathcal{B}(X_{s+T})$-measurable functions. To avoid ambiguities, it suffices to take
for pr. space the sample space of $X_T$ (see 33.2).

A tr.pr. $P_{u,v}(x, S)$ is stationary if it is invariant under translations in
time: $P_{u+s,\,v+s}(x, S) = P_{u,v}(x, S)$ for all x, S. Thus, a tr.pr.
$P_{u,v}(x, S)$ is stationary if and only if its dependence upon the time
arguments u, v reduces to dependence upon their difference t = v - u only,
so that a stationary tr.pr. may be denoted by $P_t(x, S)$. Its complete
definition is then as follows: A stationary tr.pr. $P_t(x, S)$, $t \ge 0$, $x \in \mathfrak{X}$,
$S \in \mathcal{S}$, is measurable in x, a pr. in S, with $P_0(x, S) = I(x, S)$, and it
satisfies the stationary Kolmogorov equation

$$(K_{st}) \qquad P_{s+t}(x, S) = \int P_s(x, dy)\, P_t(y, S).$$

The corresponding generalized semi-group property becomes then the
semi-group property

$$(K'_{st}) \qquad T_{s+t} = T_s T_t$$

of stationary tr. operators or Markov endomorphisms $T_t$ on the space G
of bounded Borel functions g on $\mathfrak{X}$, defined by

$$(T_t g)(x) = \int P_t(x, dy)\, g(y), \qquad T_0 = I,$$

and 38.2a becomes

a. The stationary Kolmogorov equation for stationary tr.pr.'s and the
semi-group property for corresponding stationary tr. operators (Markov
endomorphisms) are equivalent.
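
As a numerical illustration of lemma a, the following is a minimal sketch for a two-state space {0, 1}. The state space, the jump rates a and b, and the closed-form matrix used for $P_t$ are assumptions chosen for the example and do not come from the text. The sketch checks the stationary Kolmogorov equation as a matrix product and the semi-group property on a test function g.

```python
import math

def transition_matrix(t, a=2.0, b=1.0):
    """P_t for a two-state chain with assumed jump rates a (0 -> 1), b (1 -> 0)."""
    c = a + b
    e = math.exp(-c * t)
    return [[(b + a * e) / c, a * (1 - e) / c],
            [b * (1 - e) / c, (a + b * e) / c]]

def compose(p, q):
    """Kernel composition: (pq)(x, {z}) = sum over y of p(x, {y}) q(y, {z})."""
    return [[sum(p[x][y] * q[y][z] for y in range(2)) for z in range(2)]
            for x in range(2)]

def apply_operator(p, g):
    """(T_t g)(x) = sum over y of P_t(x, {y}) g(y), with g given as a list."""
    return [sum(p[x][y] * g[y] for y in range(2)) for x in range(2)]

def max_abs_diff(p, q):
    """Largest entrywise gap between two 2 x 2 kernels."""
    return max(abs(p[x][z] - q[x][z]) for x in range(2) for z in range(2))

s, t = 0.3, 1.1

# Stationary Kolmogorov equation: P_{s+t} = P_s P_t.
kolmogorov_gap = max_abs_diff(
    transition_matrix(s + t),
    compose(transition_matrix(s), transition_matrix(t)),
)

# Semi-group property on a bounded function g: T_{s+t} g = T_s (T_t g).
g = [1.0, -2.0]
lhs = apply_operator(transition_matrix(s + t), g)
rhs = apply_operator(transition_matrix(s), apply_operator(transition_matrix(t), g))
semigroup_gap = max(abs(u - v) for u, v in zip(lhs, rhs))

print(kolmogorov_gap, semigroup_gap)
```

Both gaps are at the level of floating-point rounding, which is what the equivalence in a predicts for this particular kernel.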

A regular Markov process on $T = [0, \infty)$ is stationary if its law of
evolution is invariant under translations in time. On account of 38.2(MR)
the process is stationary if and only if its tr.pr. is stationary.

Let the c.pr. $P_{X_0}$ be defined by $P_{X_0} = P_x$ when $X_0 = x$, where $P_x$ is defined
by 38.2(CD) and (CP); similarly for $E_{X_0}$. Thus, $P_{X_0}$ ($E_{X_0}$) is the c.pr.
(c.exp.) on a regular Markov process given the initial random value
$X_0$. The Markov equivalence and existence theorems, together with
(CD) and (CP) and integral definitions of c.pr.'s and c.exp.'s, yield
without any difficulty the following theorem where the functions under exp.
signs are limited to those whose exp.'s exist.

A. Markov stationarity theorem. To every stationary tr.pr.
$P_t(x, S)$ there corresponds a stationary regular Markov process $X_T$. The
following equivalent relations characterize regular Markov stationarity:

$$\text{(i)} \qquad P(X_{s+t} \in S \mid X_r, r \le s) = P_t(X_s, S)$$

or

$$P(X_{s+t_1} \in S_1, \ldots, X_{s+t_n} \in S_n \mid X_r, r \le s) = P_{X_s}[X_{t_1} \in S_1, \ldots, X_{t_n} \in S_n]$$

for all s, all finite index sets and all Borel sets.

$$\text{(ii)} \qquad P(B_s \mid X_r, r \le s) = P(B_s \mid X_s) = P_{X_s} B$$

or

$$E(\xi_s \mid X_r, r \le s) = E(\xi_s \mid X_s) = E_{X_s} \xi$$

for all events B and measurable functions $\xi$ defined on $X_T$ and their
translates $B_s$, $\xi_s$.

$$\text{(iii)} \qquad P_x A B_s = \int_A P_x(d\omega)\, P_{X_s(\omega)} B$$

or

$$E_x(\eta\, \xi_s) = E_x(\eta\, E_{X_s} \xi)$$

for all $x \in \mathfrak{X}$, all events A and measurable functions $\eta$ defined on $(X_r, r \le s)$,
all events B and measurable functions $\xi$ and their translates $B_s$, $\xi_s$.

Note that, in particular, stationarity implies that

$$P(B_s \mid X_s = x) = P(B \mid X_0 = x), \qquad E(\xi_s \mid X_s = x) = E(\xi \mid X_0 = x).$$

A r.f. $X_T$ is stationary if its law is invariant under translations in
time, while the r.f.'s of a stationary process have only a stationary law
of evolution. In order that $X_T$ be stationary it is necessary but, in
general, not sufficient that the initial distribution be stationary: $P_s = P_0$
for all s, where $P_s$ is the distribution of $X_s$. However, if the law of
evolution is stationary then this condition is also sufficient, since then, for all
events B on $X_T$ and their translates $B_s$,

$$P B_s = \int P_s(dx)\, P(B_s \mid X_s = x) = \int P_0(dx)\, P(B \mid X_0 = x) = P B.$$

Corollary. A Markov r.f. with a stationary tr.pr. is stationary if and
only if the initial distribution is stationary.
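
The corollary can be illustrated on a small example. The sketch below assumes a two-state chain with jump rates a and b and its closed-form stationary tr.pr.; these, and the helper names, are hypothetical choices made for the illustration. It compares the distribution of $X_s$ and a two-time joint law under a stationary initial distribution and under a point mass.

```python
import math

def transition_matrix(t, a=2.0, b=1.0):
    """P_t for a two-state chain with assumed jump rates a (0 -> 1), b (1 -> 0)."""
    c = a + b
    e = math.exp(-c * t)
    return [[(b + a * e) / c, a * (1 - e) / c],
            [b * (1 - e) / c, (a + b * e) / c]]

def distribution_at(initial, s, a=2.0, b=1.0):
    """Distribution of X_s: P_s({y}) = sum over x of P_0({x}) P_s(x, {y})."""
    p = transition_matrix(s, a, b)
    return [sum(initial[x] * p[x][y] for x in range(2)) for y in range(2)]

def joint_law(initial, s, t, a=2.0, b=1.0):
    """P(X_s = x, X_{s+t} = y) = P_s({x}) P_t(x, {y})."""
    ps = distribution_at(initial, s, a, b)
    pt = transition_matrix(t, a, b)
    return [[ps[x] * pt[x][y] for y in range(2)] for x in range(2)]

a, b = 2.0, 1.0
stationary = [b / (a + b), a / (a + b)]  # invariant under every P_s
point_mass = [1.0, 0.0]                  # not invariant

print(distribution_at(stationary, 3.0))   # stays at the stationary distribution
print(distribution_at(point_mass, 3.0))   # moves away from the point mass
print(joint_law(stationary, 0.0, 0.7) , joint_law(stationary, 5.0, 0.7))
```

With the stationary initial distribution the joint law of $(X_s, X_{s+t})$ does not depend on s; with the point mass it does, although the tr.pr. is the same in both cases.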

For stationary tr.pr.'s the study of tr. law derivatives becomes as
follows: The stationary tr.d.f. $F_t^x(y)$ is defined by $F_t^x(y) = P_t(x, (-\infty, y))$
(when $\mathfrak{X} = R$) and the corresponding stationary tr.ch.f. $f_t^x$ is
its ch.f. Upon centering at x, we have

$$\bar F_t^x(y) = F_t^x(x + y), \qquad \bar f_t^x(u) = e^{-iux} f_t^x(u).$$

The representation of tr. law derivatives, when they exist, reduces to
the limit ch.f.'s of $(\bar f_h^x)^{[1/h]}$ as $h \to 0$ (independent of t), necessarily of the
form $e^{\psi^x}$, $\psi^x = (\alpha^x, \Psi^x)$. Note that if $e^{\psi^x}$ exists, then
$P_h(x, (x - \epsilon, x + \epsilon)^c) \to 0$ for every $\epsilon > 0$ as $h \to 0$. The corresponding existence
criterion becomes

b. The stationary tr. law derivative $e^{\psi^x}$, $\psi^x = (\alpha^x, \Psi^x)$, exists if and only if

$$\Psi_h^x \xrightarrow{c} \Psi^x \quad \text{and} \quad \int d\Psi_h^x(y)/y \to \alpha^x \quad \text{as } h \to 0, \qquad \text{where } \Psi_h^x(y) = \frac{1}{h} \int_{-\infty}^{y} \frac{z^2}{1 + z^2}\, d\bar F_h^x(z).$$

The corresponding theorem becomes

B. Let $e^{\psi^x}$, $\psi^x = (\alpha^x, \Psi^x)$, represent the tr. law derivative corresponding
to a stationary tr.pr. $P_t(x, S)$ and let g on R be a bounded twice differentiable
function. Then, as $h \to 0$,

$$\frac{1}{h} \Big\{ \int P_h(x, dy)\, g(y) - g(x) \Big\} \to \alpha^x g'(x) + \int h_g(x, y)\, d\Psi^x(y).$$

In particular, if $P_t(x, S)$ is twice differentiable in x then its right derivative

$$\frac{\partial^+}{\partial t} P_t(x, S) = \lim_{h \to 0} \frac{1}{h} \{ P_{t+h}(x, S) - P_t(x, S) \}$$

exists and, upon introducing the P. Lévy form $\psi^x = (\alpha^x, (\beta^x)^2, L^x)$,

$$\frac{\partial^+}{\partial t} P_t(x, S) = \alpha^x \frac{\partial}{\partial x} P_t(x, S) + \frac{(\beta^x)^2}{2} \frac{\partial^2}{\partial x^2} P_t(x, S) + \int \Big\{ P_t(x + y, S) - P_t(x, S) - \frac{y}{1 + y^2} \frac{\partial}{\partial x} P_t(x, S) \Big\}\, dL^x(y).$$

The Kolmogorov continuous case ($L^x$ vanishes) becomes then the
Fokker-Planck original case.

From now on and unless otherwise stated, all our Markov processes will
be regular and stationary on $T = [0, \infty)$ so that, whenever convenient, we
will drop "regular" and "stationary."

43.4. Strong Markov property. The central problem of random
analysis is that of the sample functions behavior. As soon as this
problem arises, random times appear, say, the time of appearance of the
first discontinuity of sample functions or the time when the r.f. takes a
given value. In the Markov case, we might expect the Markov property
to hold under conditionings with a random time $\tau$ as "present," since it
holds when every one of its values is used as "present." In fact, for a
long time this possibility was not even questioned. Yet, let $X_T$ be a
stationary regular Markov r.f. with tr.pr.

$$P_t(x, S) = \frac{1}{\sqrt{2\pi t}} \int_S e^{-(y - x)^2/2t}\, dy, \quad x \ne x_0, \qquad P_t(x_0, S) = I(x_0, S).$$

Let $\tau$ be the time $X_T$ first reaches $x_0$: $X_t(\omega) \ne x_0$ for $t < \tau(\omega)$ and $X_t(\omega)
= x_0$ for $t = \tau(\omega)$. For almost every one of its sample functions, if we
know that $X_t(\omega)$ is in a state $x \ne x_0$ at a time $t_0$, then this sample
function is a Brownian continuous one, and, when it passes through $x_0$, it
does not stop there; while if we know that it is in the state $x_0$ at time $t_0$,
then it stays there forever. It follows that the Markov property of $X_T$
is no more true with $\tau$ as "present." Thus, the Markov property is to
be strengthened if we want to be able to investigate the sample functions
behavior. In other words, we restrict ourselves to those Markov r.f.'s,
always separable, for which, for suitable $\tau$,

$$P(B_\tau \mid X_r, r \le \tau) = P_{X_\tau} B,$$

where $B \in \mathcal{B}(X_T)$ and $B_\tau$ is its translate by a random time $\tau$, a "random
present." However, the introduction of random "presents" raises
immediate difficulties. Let, say, $\tau$ be the first time $X_T$ takes a given
value $x_0$. To begin with, $\tau(\omega)$ does not exist for those sample functions
which never take the value $x_0$. If $\tau$ exists it may not be measurable;
even if it is measurable it may take infinite values $\tau(\omega) = \infty$, and
$X_{\tau(\omega)}(\omega)$ does not exist unless the point at infinity is added to the time
interval. If $\tau$ is measurable and finite, $X_\tau = (X_{\tau(\omega)}(\omega), \omega \in \Omega)$ may
not be measurable; even if all the $X_{\tau+t}$ are measurable, the formal Markov
property above has still to be given a meaning. Thus, at first we have
to consider and eliminate these difficulties.

A random time $\tau$ will be a nonnegative measurable function, not
necessarily finite but not a.s. infinite: $P\Omega_\tau > 0$, $\Omega_\tau = [\tau < \infty]$. For random
times corresponding to sample properties, existence and measurability
are to be proved. However, if a random time exists only outside some
event, we take it to be infinite on the exceptional event. This
convention is acceptable as long as we consider sample functions on the given
time interval $[0, \infty)$ only. In other words, we limit ourselves to the restricted
pr. space $(\Omega_\tau, \mathcal{A}_\tau, P_\tau)$ where $\mathcal{A}_\tau = (\Omega_\tau A, A \in \mathcal{A})$ and $P_\tau = P/P\Omega_\tau$; this is
one reason for excluding the possibility of $P\Omega_\tau = 0$. Since we seek sample
properties which, at best, are those of almost all sample functions, that
is, are valid outside a null event, the exclusion of $P\Omega_\tau = 0$ is not a
restriction on random times. Note that whenever relations between pr.'s
are homogeneous in $P_\tau$ we may and will replace $P_\tau$ by P (upon multiplying
throughout by a suitable power of $P\Omega_\tau$).

Translates by $\tau$ of events and, more generally, random variables
defined on $X_T$ are defined as in 30.2: The translate of $A_t = [X_t \in S]$ is
$A_{\tau+t} = [X_{\tau+t} \in S]$ where $X_{\tau+t}(\omega) = X_{\tau(\omega)+t}(\omega)$, the translate of $X_t$ is
$X_{\tau+t}$ and, in general, the translate of $\xi = g(X_T)$ is $\xi_\tau = g(X_{\tau+T})$ where
$X_{\tau+T} = (X_{\tau+t}, t \in T)$. Translates by a random time $\tau$ are considered
on $\Omega_\tau$ only, and we drop "on $\Omega_\tau$." If a random time $\tau$ is elementary,
$\tau = \sum_j t_j I_{A_j}$ (on $\Omega_\tau$), then, for any r.f. $X_T = (X_t, t \ge 0)$, the translates
$X_{\tau+t}$ by $\tau$ are r.v.'s $X_{\tau+t} = \sum_j X_{t_j+t} I_{A_j}$ so that the translate $X_{\tau+T} =
(X_{\tau+t}, t \ge 0)$ is a r.f. If $\tau$ is not elementary, then we have to assume or
to prove that $X_{\tau+T}$ is a r.f. It is a r.f. when $X_T$ is Borelian. In
particular, $X_T$ is Borelian when it is sample right continuous. For then,
it is Borelian as limit of Borel r.f.'s $X_T^{(n)} = (X_t^{(n)}, t \ge 0)$ where

$$X_t^{(n)} = \sum_{k=1}^{\infty} X_{k/2^n}\, I_{[(k-1)/2^n \le t < k/2^n]}.$$

To summarize

a. The translates of arbitrary r.f.'s by elementary random times are r.f.'s
and so are the translates by arbitrary random times of Borel r.f.'s, in
particular, of sample right continuous r.f.'s.

A random time $\tau$ will be a time of $X_T$ if $[\tau \le t] \in \mathcal{B}(X_s, s \le t)$ for
every t; in other words, if we know what happened up to time t inclusive,
that is, if we know the sample values $X_s(\omega)$, $s \le t$, then we know
whether $\tau(\omega) \le t$ or not. In particular, every "degenerate" time t is a
time of any r.f. and if $\tau$ is a time of $X_T$ so is every $\tau + t$. In fact, since
the inverse images under a time $\tau$ of $X_T$ of Borel sets in [0, t] belong to
$\mathcal{B}(X_s, s \le t)$, we have

b. If $\tau$ is a time of $X_T$ so are the random times $\tau + t$ and, in fact, so are
the random times $g(\tau) \ge \tau$, where the functions g are Borelian.

For, by hypothesis, $[g(\tau) \le t] = \tau^{-1}(g^{-1}[0, t]) \in \mathcal{B}(X_s, s \le t)$.

A trivial example of a time of $X_T$ is any elementary time $\tau = \sum_j t_j I_{A_j}$,
$t_1 < t_2 < \cdots$, where every $A_j \in \mathcal{B}(X_r, r \le t_j)$.

A nontrivial and important example is as follows. Let U be an open
state set and let $\tau_U(\omega)$ be the infimum of all t such that the distance of
the sets $(X_s(\omega), s \le t)$ and $U^c$ be zero; if $\tau_U(\omega)$ does not exist we take it
to be infinite. If $X_T$ is sample right continuous, then $\tau_U$ is the time $X_T$
first hits $U^c$, and if $X_T$ is sample continuous, then $\tau_U$ is the time $X_T$ first
reaches $U^c$.

c. If $X_T$ is sample continuous (from the right), then the time $X_T$ first
reaches (hits) $U^c$ is a time of $X_T$.

For, setting $S_n = [x: d(x, U^c) < 1/n]$ and letting r < t vary over the
rationals,

$$[\tau_U \le t] = [X_t \in U^c] \cup \Big( \bigcap_n \bigcup_{r < t} [X_r \in S_n] \Big).$$
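
The defining feature of a time of $X_T$, that the event $[\tau \le t]$ is decided by the sample values up to t, can be sketched for a path observed at finitely many instants. This discretized setting, the interval used for U, and the function names are assumptions made for illustration only; the lemma itself concerns continuous time.

```python
def first_exit_index(path, lower, upper):
    """Index of the first sample outside the open interval U = (lower, upper).

    Returns None when the sampled path never leaves U, which plays the
    role of an infinite value of the random time.
    """
    for index, value in enumerate(path):
        if not (lower < value < upper):
            return index
    return None

def exited_by(path, lower, upper, t):
    """Decide the event [tau_U <= t] from the samples path[0], ..., path[t] only."""
    return first_exit_index(path[: t + 1], lower, upper) is not None

path_a = [0.0, 0.4, 0.9, 1.2, 0.7, -0.3]
path_b = [0.0, 0.4, 0.9, 0.8, 0.5, 0.1]  # agrees with path_a up to index 2

print(first_exit_index(path_a, -1.0, 1.0))  # 3
print(first_exit_index(path_b, -1.0, 1.0))  # None
print(exited_by(path_a, -1.0, 1.0, 2), exited_by(path_b, -1.0, 1.0, 2))  # False False
```

Two paths that agree up to index 2 give the same answer to "has the exit occurred by index 2?", whatever they do afterwards.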

If $\tau$ is a time of $X_T$, the events $A \subset \Omega_\tau$ such that $A[\tau \le t] \in \mathcal{B}(X_s, s \le t)$
for all t form the $\sigma$-field $\mathcal{B}(X_s, s \le \tau)$, in $\Omega_\tau$, of events defined on
$(X_s, s \le \tau)$ or on $X_T$ up to time $\tau$ (inclusive). Since the definitions of
times $\tau$ of $X_T$ and of events defined on $X_T$ up to time $\tau$ are in terms of
events on $X_T$, it follows that in the Markov case they pertain to all the
r.f.'s of the Markov process simultaneously, that is, to the process as a
whole. The same is true for what follows and, therein, a "stationary
Markov $X_T$" will mean the process $X_T$ as well as any r.f. belonging to it.

A time $\tau$ of a stationary Markov $X_T$ with stationary Borel tr.pr.
$P_t(x, S)$ is Markovian or a Markov time of $X_T$ if all $X_{\tau+t}$ are r.v.'s and,
given $X_\tau$, the Markov evolution starts anew:

$$P(X_{\tau+t} \in S \mid X_r, r \le \tau) = P(X_{\tau+t} \in S \mid X_\tau)$$

with the same Markov law of evolution:

$$\text{(TM)} \qquad P(X_{\tau+t} \in S \mid X_{r+\tau}, r \le s) = P_{t-s}(X_{\tau+s}, S), \quad s \le t;$$

it may and will happen that $X_\tau$ is to be replaced by $X_{\tau+0}$ for these
relations and hence for the following ones but, to simplify, we shall still
speak of a Markov time $\tau$; as usual, we write $\mid \cdot$ in lieu of $\mid \mathcal{B}(\cdot)$.

Upon setting s = 0 in the second relation, the first becomes equivalent
to the relation

$$\text{(SM)} \qquad P(X_{\tau+t} \in S \mid X_r, r \le \tau) = P_t(X_\tau, S).$$

Thus Markov times $\tau$ are also characterized by (SM) and (TM). Furthermore,
since our Markov evolution is stationary in terms of its degenerate
Markov times t, it is natural to require that the same be true in terms
of its random Markov times $\tau$: we say that $\tau$ is a stationary Markov time
of $X_T$ if it is and remains a Markov time under translations in time, that
is, if all $\tau + s$, $s \ge 0$, are Markov times of $X_T$. Thus, stationary Markov
times $\tau$ are characterized by (SM) and (TM) with $\tau$ replaced by any
$\tau + s$ or, equivalently, by

$$(\text{SM}_{st}) \qquad P(X_{\tau+t} \in S \mid X_r, r \le \tau + s) = P_{t-s}(X_{\tau+s}, S), \quad s \le t.$$

For, (SM) with $\tau + s$ in lieu of $\tau$ yields $(\text{SM}_{st})$ with t + s in lieu of t,
while $(\text{SM}_{st})$, with s + t and s + s' in lieu of t and s and conditioned by
$(X_{\tau+s+r}, r \le s')$, yields (TM) with $\tau + s$ and s' in lieu of $\tau$ and s.

A very useful example is that of elementary times:

d. Elementary times of an arbitrary stationary Markov $X_T$ are stationary
Markov times of $X_T$.

Proof. Let $\tau = \sum_j t_j I_{A_j}$ (on $\Omega_\tau$), $t_1 < t_2 < \cdots$, $A_j \in \mathcal{B}(X_r, r \le t_j)$,
be an elementary time of $X_T$. The integral form of $(\text{SM}_{st})$ is

$$P_x A[X_{\tau+t} \in S] = \int_A P_x(d\omega)\, P_{t-s}(X_{\tau+s}(\omega), S), \qquad A \in \mathcal{B}(X_r, r \le \tau + s).$$

We have to prove that this relation holds for the elementary time $\tau$,
namely that

$$\sum_j P_x A A_j [X_{t_j+t} \in S] = \sum_j \int_{A A_j} P_x(d\omega)\, P_{t-s}(X_{t_j+s}(\omega), S).$$

Since $A A_j = A[\tau = t_j] \in \mathcal{B}(X_r, r \le t_j + s)$, the ordinary Markov
property of $X_T$ applies, that is,

$$P(X_{t_j+t} \in S \mid X_r, r \le t_j + s) = P_{t-s}(X_{t_j+s}, S),$$

and its integral form is the equality between the terms with same j of
both sums. Therefore, these sums are equal and the assertion is proved.

Note that it would have sufficed to prove that elementary times of $X_T$
are Markovian since their translates are also elementary times, hence
Markovian.

If all the times of a stationary Borel Markov $X_T$ with a Borel tr.pr.
are Markovian, we say that $X_T$ is (stationary) strongly Markovian or has
the strong Markov property. If $\tau$ is a time of $X_T$ so are all $\tau + s$, hence
all the times of a stationary strongly Markovian $X_T$ are stationary
Markov times of $X_T$. Therefore, (SM) holds for all times of $X_T$ or
$(\text{SM}_{st})$ holds for all times of $X_T$ if and only if $X_T$ is strongly Markovian.

If $\tau'$ varies over all the times of $X_T$ so does $\tau = \tau' I_A + \infty I_{A^c}$ with A
varying over all the events defined on $(X_r, r \le \tau')$. For, $\tau$ is a Borel
function of $\tau'$ at least equal to $\tau'$ and $\tau = \tau'$ when $A = \Omega_{\tau'}$. The integral
form of (SM) with $\tau'$ in lieu of $\tau$ is

$$P_x A[X_{\tau'+t} \in S] = \int_A P_x(d\omega)\, P_t(X_{\tau'}(\omega), S)$$

and, in terms of $\tau$, becomes

$$P_x[X_{\tau+t} \in S] = \int_{\Omega_\tau} P_x(d\omega)\, P_t(X_\tau(\omega), S).$$

Set

$$P_\tau(x, S) = P_x[X_\tau \in S]$$

by analogy with the identity $P_t(x, S) = P_x[X_t \in S]$. Then the above
relation becomes: for every x

$$(\text{SK}_{st}) \qquad P_{\tau+t}(x, S) = \int P_\tau(x, dy)\, P_t(y, S).$$

This equality reduces to the stationary Kolmogorov equation for
degenerate times and will be called (stationary) strong Kolmogorov equation.
The same arguments as in 43.2 and 43.3 lead to the equivalent
semi-group property

$$T_{\tau+t} = T_\tau T_t$$

of the Markov endomorphisms on G with $T_\tau$ defined as $T_t$ by

$$(T_\tau g)(x) = \int P_\tau(x, dy)\, g(y), \qquad g \in G.$$

In fact, the arguments in the preceding subsections remain valid when,
therein, s is replaced by $\tau$ and, together with what precedes, yield

A. Strong Markov equivalence theorem. Let $P_t(x, S)$ be a Borel
stationary tr.pr., where x varies over $\mathfrak{X}$, S over $\mathcal{S}$, and let $s \le t$ vary over
$[0, \infty)$. Let $\tau$ vary over all the times of Borel r.f. $X_T$.

(i) The following equivalent relations characterize the (stationary) strong
Markov property of $X_T$ with tr.pr. $P_t(x, S)$:

$$P(X_{\tau+t} \in S \mid X_r, r \le \tau) = P_t(X_\tau, S)$$

or

$$P_x A B_\tau = \int_A P_x(d\omega)\, P_{X_\tau(\omega)} B \qquad \text{or} \qquad E_x(\eta\, \xi_\tau) = E_x(\eta\, E_{X_\tau} \xi)$$

for all events B and measurable functions $\xi$ (whose exp.'s exist) on $X_T$, for all
$x \in \mathfrak{X}$ and all events A and measurable $\eta$ on $(X_r, r \le \tau)$.

(ii) The following equivalent relations characterize the strong Markov
property of a stationary Markov $X_T$ with tr.pr. $P_t(x, S)$:

$$P_{\tau+t}(x, S) = \int P_\tau(x, dy)\, P_t(y, S) \qquad \text{or} \qquad T_{\tau+t} = T_\tau T_t.$$

An important example of strong Markov $X_T$ is as follows (Dynkin and
Ushkevitch).

Corollary. Let $X_T$ be stationary Markovian with Borel tr.pr. If
$X_T$ is sample right continuous and the Markov endomorphisms $T_t$
transform bounded continuous functions into bounded continuous functions,
then $X_T$ is strongly Markovian.

Proof. Let $\tau$ be a time of $X_T$ and note that the elementary times

$$\tau_n = \sum_{k=1}^{\infty} \frac{k}{2^n}\, I_{[(k-1)/2^n \le \tau < k/2^n]}$$

converge to $\tau$ from the right (on $\Omega_\tau$). Let $s \ge 0$ be arbitrary. Since
$X_T$ is sample right continuous, it follows by a that it is Borelian so that
$X_{\tau+s}$ are r.v.'s, and $X_{\tau_n+s} \to X_{\tau+s}$. Therefore, if $g \in G$ is continuous
then $g(X_{\tau_n+s}) \to g(X_{\tau+s})$ and

$$E_x\, g(X_{\tau_n+s}) \to E_x\, g(X_{\tau+s}) \qquad \text{or} \qquad T_{\tau_n+s}\, g \to T_{\tau+s}\, g.$$

Since, by d, elementary times $\tau_n$ of $X_T$ are Markov times, hence, by A(ii),
$T_{\tau_n+t}\, g = T_{\tau_n} T_t\, g$ and $T_t g \in C$, we have $T_{\tau+t}\, g = T_\tau T_t\, g$. Thus, the family
of bounded functions g on $\mathfrak{X}$ for which this relation holds contains the
continuous ones. It is closed under passages to the limit by bounded
sequences. Therefore, by the Baire definition of Borel functions, it
contains the family G of all bounded Borel functions on $\mathfrak{X}$ and, by A(ii),
$\tau$ is a strong Markov time of $X_T$. The proof is terminated.
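
The proof approximates an arbitrary time from the right by elementary times. One common way to do this, assumed here for illustration, is to round a finite value of the time up to the next point of the dyadic grid of mesh $2^{-n}$; the function name below is hypothetical.

```python
import math

def dyadic_upper(tau, n):
    """Smallest grid point k / 2**n strictly greater than tau (tau finite, tau >= 0)."""
    return (math.floor(tau * 2 ** n) + 1) / 2 ** n

tau = 0.3
approximations = [dyadic_upper(tau, n) for n in range(1, 11)]
print(approximations)
```

Each approximation takes values in a countable grid, lies strictly to the right of tau, and is within $2^{-n}$ of it, so the sequence decreases to tau as n grows.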

§ 44. TIME-CONTINUOUS TRANSITION PROBABILITIES 

Let $P_t(x, S)$ be a stationary tr.pr., that is, a Borel function in
x and a pr. in S with $P_0(x, S) = I(x, S)$, obeying the Kolmogorov
equation:

$$P_{s+t}(x, S) = \int P_s(x, dy)\, P_t(y, S), \qquad s, t \ge 0.$$

Unless otherwise stated, the time arguments r, s, t, with or without affixes,
vary over $[0, \infty)$, x varies over the state space $\mathfrak{X}$, and S varies over the $\sigma$-field
of state sets $\mathcal{S}$ generated by the class of open sets in $\mathfrak{X}$. As usual, to fix the
ideas, we take $(\mathfrak{X}, \mathcal{S})$ to be the Borel line and take the limits in h as
$0 < h \to 0$.

However, the properties of the tr.pr. to be established and the proofs
given are valid for all state spaces $\mathfrak{X}$ such that the diagonal (which
consists of all points (x, x)) in $\mathfrak{X} \times \mathfrak{X}$ belongs to the product $\sigma$-field
$\mathcal{S} \times \mathcal{S}$ and, consequently, all singletons {x} belong to $\mathcal{S}$ as sections of the
diagonal. Also they remain valid when the tr.pr. $P_t(x, S)$ is only a
measure in S bounded by 1 in lieu of being a pr. in S. We leave the
search for corresponding extensions of properties in 44.2 to the reader.

Denote by $\dot P_s(x, S)$ the derivative of $P_t(x, S)$ with respect to t at
t = s, provided it exists. Note that the derivative $\dot P_0(x, S)$ at t = 0 is
necessarily a right derivative. Set

$$q(x) = \lim_{h \to 0} \frac{1 - P_h(x, \{x\})}{h}, \qquad q(x, S) = \lim_{h \to 0} \frac{P_h(x, S)}{h}, \quad x \notin S,$$

whenever the limits exist, and make the convention that

$$q(x, S) = q(x, S\{x\}^c);$$

then

$$\dot P_0(x, S) = \lim_{h \to 0} \frac{P_h(x, S) - I(x, S)}{h} = q(x, S) - q(x) I(x, S).$$

Formal differentiation of Kolmogorov's equation with respect to s at
s = 0 and with respect to t at t = 0 (followed by the change of s into t)
yields the backward and the forward equations

$$\text{(B)} \qquad \dot P_t(x, S) = \int_{\{x\}^c} q(x, dy)\, P_t(y, S) - q(x) P_t(x, S)$$

$$\text{(F)} \qquad \dot P_t(x, S) = \int P_t(x, dy)\, q(y, S) - \int_S P_t(x, dy)\, q(y).$$

Formal solutions (by formal substitution) of these equations are given by

$$P_t(x, S) = \sum_{n=0}^{\infty} P_t^{(n)}(x, S)$$

where

$$P_t^{(0)}(x, S) = e^{-q(x)t} I(x, S)$$

and

$$P_t^{(n+1)}(x, S) = \int_0^t e^{-q(x)s}\, ds \int_{\{x\}^c} q(x, dy)\, P_{t-s}^{(n)}(y, S)$$

or, alternatively,

$$P_t^{(n+1)}(x, S) = \int_0^t ds \int P_s^{(n)}(x, dy) \int_{S\{y\}^c} q(y, dz)\, e^{-q(z)(t-s)}.$$

In probabilistic language, this $P_t(x, S)$ is the pr. of transition from x into S
in time t in finitely many steps (see the probabilistic interpretation of
q(x) and q(x, S) at the end of the section). Thus, whenever there is a
possibility of such a transition but not in finitely many steps, it may be
expected that $P_t(x, S)$ will be a measure in S smaller than 1 in lieu of a
pr. in S.
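
As a worked instance of the series solution, consider the simplest nontrivial case. The two-point state space {0, 1} and the common rate $\lambda$ are assumptions introduced here for illustration; they are not taken from the text. The n-th term is then the pr. of exactly n jumps in time t, and the series sums in closed form to a pr.

```latex
% Assumed example: state space {0, 1} with
%   q(0) = q(1) = \lambda,  q(0,\{1\}) = q(1,\{0\}) = \lambda.
\begin{align*}
P_t^{(0)}(0,\{0\}) &= e^{-\lambda t}, \\
P_t^{(1)}(0,\{1\}) &= \int_0^t e^{-\lambda s}\,\lambda\,e^{-\lambda (t-s)}\,ds
                    = \lambda t\, e^{-\lambda t}, \\
P_t^{(n)}(0,\{n \bmod 2\}) &= e^{-\lambda t}\,\frac{(\lambda t)^n}{n!}
   \quad\text{(by induction on $n$; the other state gets mass $0$)}, \\
P_t(0,\{1\}) = \sum_{n\ \mathrm{odd}} P_t^{(n)}(0,\{1\})
   &= e^{-\lambda t}\sinh \lambda t = \tfrac12\bigl(1 - e^{-2\lambda t}\bigr), \\
P_t(0,\{0\}) = \sum_{n\ \mathrm{even}} P_t^{(n)}(0,\{0\})
   &= e^{-\lambda t}\cosh \lambda t = \tfrac12\bigl(1 + e^{-2\lambda t}\bigr).
\end{align*}
```

Here the two sums add up to 1, so the formal solution is a pr. in S; the bounded rate rules out infinitely many jumps in finite time, which is the situation in which the text expects a total mass smaller than 1.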

The foregoing formal discussion brings into light the problems to be
considered: to begin with, the problem of existence and properties of
tr.pr. derivatives, hence of q(x) and q(x, S), and of the corresponding
sample properties. Since existence of $\dot P_0(x, S)$ implies the continuity
condition

$$\text{(C)} \qquad P_h(x, S) \to I(x, S),$$

this condition will be assumed at the start. This is the Doblin-Doob
approach to the analysis of Markov evolution. The analytical problem
of existence, unicity, and tr.pr. properties of solutions of the backward
and forward equations, the Kolmogorov-Feller approach, will be left
out for it fits within the wider and more powerful
Hille-Yosida-Feller-Dynkin semi-group approach.

44.1. Differentiation of tr.pr.'s. The Doblin-Doob results under
condition (C) were improved by Kolmogorov who established the existence
and finiteness of the function q(x, S) for countable state spaces and his
result was extended by Kendall to noncountable state spaces of the
general type described above, under a parallel restriction of "q-uniformity."

Note that condition (C) is equivalent to $P_h(x, \{x\}) \to 1$ or, on account
of lemma c below, to the continuity in t of the tr.pr.

a. For every section $D_x$ of $D \in \mathcal{S} \times \mathcal{S}$, and in particular for sections {x}
of the diagonal, the function $P_t(x, D_x)$ is Borelian in x.

Proof. The class of sets D for which the assertion holds is closed
under finite summations and monotone passages to the limit by
sequences. It contains all rectangles $S \times S' \in \mathcal{S} \times \mathcal{S}$ since the function
$P_t(x, (S \times S')_x) = I(x, S) P_t(x, S')$ has the asserted property. Therefore,
it contains the minimal $\sigma$-field $\mathcal{S} \times \mathcal{S}$ generated by the rectangles.

b. The function $P_t(x, \{x\})$ is supermultiplicative in t.

For, by Kolmogorov's equation,

$$P_{s+t}(x, \{x\}) \ge \int_{\{x\}} P_s(x, dy)\, P_t(y, \{x\}) = P_s(x, \{x\})\, P_t(x, \{x\}).$$

c. Under (C), the function $g(t) = -\log P_t(x, \{x\})$ exists and is finite
and subadditive in t, and the function $P_t(x, S)$ is uniformly continuous in t
uniformly in S.

Proof. Let $P_h(x, \{x\}) \to 1$. Then, by b, $P_t(x, \{x\}) \ge (P_{t/n}(x, \{x\}))^n > 0$
for n sufficiently large. Therefore, g(t) exists and is finite, and
supermultiplicativity in b becomes subadditivity of g(t). The first assertion
is proved. By Kolmogorov's equation

$$\Delta = P_{t+h}(x, S) - P_t(x, S) = \int_{\{x\}^c} P_h(x, dy)\, P_t(y, S) - (1 - P_h(x, \{x\}))\, P_t(x, S)$$

so that

$$-(1 - P_h(x, \{x\})) \le \Delta \le P_h(x, \{x\}^c) = 1 - P_h(x, \{x\}).$$

Therefore,

$$|P_s(x, S) - P_t(x, S)| \le 1 - P_{|s-t|}(x, \{x\}) \to 0$$

uniformly in t and in S as $s - t \to 0$, and the second assertion is proved.

d. Point differentiation lemma. Under (C),

$$\frac{1 - P_h(x, \{x\})}{h} \to q(x) \le \infty$$

where the function q(x) is Borelian, and $P_t(x, \{x\}) \ge e^{-q(x)t}$.

Proof. If q(x) exists, then it is Borelian as limit of sequences of Borel
functions corresponding to $h = h_n \to 0$.

Fix x, set $g(t) = -\log P_t(x, \{x\})$ and note that, by c,

$$0 \le g(t) < \infty, \qquad g(s + t) \le g(s) + g(t), \qquad g(+0) = g(0) = 0.$$

Given t > 0 and h > 0, take n = [t/h] so that $t = nh + \theta$, $0 \le \theta < h$
and, by subadditivity,

$$\frac{g(t)}{t} \le \frac{g(\theta)}{t} + \frac{g(h)}{h} \cdot \frac{nh}{t}.$$

Therefore, $g(t)/t \le \liminf_{h \to 0} g(h)/h$ and

$$\limsup_{h \to 0} \frac{g(h)}{h} \le \sup_{t > 0} \frac{g(t)}{t} \le \liminf_{h \to 0} \frac{g(h)}{h}.$$

Thus, $q(x) = \lim_{h \to 0} g(h)/h$ exists with

$$q(x) = \sup_{t > 0} \frac{g(t)}{t}.$$

The second assertion follows by

$$P_t(x, \{x\}) = e^{-g(t)} \ge e^{-q(x)t}$$

and the first one follows by

$$\frac{1 - P_h(x, \{x\})}{h} = \frac{1 - e^{-g(h)}}{h} = (1 + o(1))\, \frac{g(h)}{h}.$$

The proof is terminated.
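
The statements of lemmas c and d can be checked numerically on a simple kernel. The sketch below assumes a two-state chain with jump rates a and b and its closed-form value of $P_t(0, \{0\})$; the chain, the rates, and the function names are illustrative assumptions. For this chain q(0) = a.

```python
import math

def stay_probability(t, a=2.0, b=1.0):
    """P_t(0, {0}) for a two-state chain with assumed rates a (0 -> 1), b (1 -> 0)."""
    c = a + b
    return (b + a * math.exp(-c * t)) / c

def g(t, a=2.0, b=1.0):
    """g(t) = -log P_t(0, {0}), the subadditive function of lemma c."""
    return -math.log(stay_probability(t, a, b))

def difference_quotient(h, a=2.0, b=1.0):
    """(1 - P_h(0, {0})) / h, which lemma d says converges to q(0)."""
    return (1.0 - stay_probability(h, a, b)) / h

a = 2.0
quotients = [difference_quotient(10.0 ** -k) for k in range(1, 7)]
print(quotients)  # increases towards q(0) = a as h decreases

grid = [0.05 * k for k in range(1, 40)]
subadditive = all(g(s + t) <= g(s) + g(t) + 1e-12 for s in grid for t in grid)
lower_bound = all(stay_probability(t) >= math.exp(-a * t) for t in grid)
print(subadditive, lower_bound)
```

The quotient g(t)/t stays below a on the grid and approaches it only as t tends to 0, in line with $q(x) = \sup_{t>0} g(t)/t$.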

We say that a state x is absorbing, instantaneous, or steady according
as q(x) = 0, $q(x) = \infty$, or $0 < q(x) < \infty$. We say that a set $U =
[x: q(x) \le c]$ with c finite is q-bounded. Since the function q(x) is Borelian,
q-bounded sets are state sets and, since $1 - P_h(x, \{x\}) \le 1 - e^{-ch} \to 0$
uniformly in $x \in U$, the continuity condition $P_h(x, \{x\}) \to 1$ holds
uniformly on every q-bounded set. The same is true on every finite
state set even if it has instantaneous states. In general, let $\mathcal{U}$ be the
class of all uniform continuity state sets, on each of which (C) holds
uniformly. Clearly $\mathcal{U}$ is closed under taking finite unions and state
subsets of its sets and we shall use these closure properties without
further comment. Unless otherwise stated, we denote the sets of $\mathcal{U}$ by U,
with or without affixes.

e. Under (C),/or every x and every uniform continuity state set U 


Pk(x y U) 
~~ h ~~ 




finite bounded by q{x\ Borelian in x and a measure in U. 

Proof. If the function q(x y U) exists, then it is Borelian in x as limit 
of sequences of Borel functions corresponding to h = h n 0 and it is 
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bounded by q{x) since 



1 一 Ph(x y {^}) p h (x y U) 
~ \ — 


q{x) — q{x y U). 


Let ^ C ^ be such that t ； + {x} C V y (say, 〆 =[/+ \ x }). Given a 
positive € < ^ there exists a positive / 0 = / 0 (^, e) such that, for 
{ = ^o> 户心，〉 1 — € for all ^ C ^ hence Pt{x^ U) < e. Set 

P f h(x y S) = Phi^y S)y P\j^i)h(Xy ^) = f P f jh{x y dy)Ph{y y S). 

J u c ^ 

In probabilistic language, P\j^i)h(x y S) is the pr. of transition from ^ 
into <9 in time (j + 1)A avoiding U at times A ， … 、 jh. From this 
probabilistic interpretation, or directly by induction, it follows that 

A ： — 1 广 

⑴ PkhOc ， 幻 =E I P f jh(x y dy)P (k _ j)h (y y S) + F kh (x y S). 

；=i J u 

Given $h$ and $t \le t_0$, let $n = [t/h]$ so that $nh = t - \theta$,
$0 \le \theta < h$. Since
$P'_{kh}(x, S) \ge P'_{(k-1)h}(x, \{x\})\, P_h(x, S)$ and, by (1) with
$S = U$,

$$P_{nh}(x, U) = \sum_{k=1}^{n} \int_U P'_{kh}(x, dy)\, P_{(n-k)h}(y, U),$$

it follows that

$$P_{nh}(x, U) \ge \sum_{k=1}^{n} P'_{(k-1)h}(x, \{x\}) \int_U P_h(x, dy)\, P_{(n-k)h}(y, U),$$

hence

$$(2) \qquad P_{nh}(x, U) \ge \sum_{k=1}^{n} P'_{(k-1)h}(x, \{x\})\,(1 - \varepsilon)\, P_h(x, U).$$

It also follows that

$$\varepsilon > P_{nh}(x, U) \ge (1 - \varepsilon) \sum_{k=1}^{n} P'_{kh}(x, U),$$

hence

$$(3) \qquad \sum_{k=1}^{n} P'_{kh}(x, U) \le \frac{\varepsilon}{1 - \varepsilon}.$$


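On account of (3) and (1) with $S = \{x\}$ and $k \le n$,

$$1 - \varepsilon < P_{kh}(x, \{x\}) \le \sum_{j=1}^{k-1} P'_{jh}(x, U) + P'_{kh}(x, \{x\}) \le \frac{\varepsilon}{1 - \varepsilon} + P'_{kh}(x, \{x\}),$$

hence

$$(4) \qquad P'_{kh}(x, \{x\}) \ge \frac{1 - 3\varepsilon}{1 - \varepsilon}.$$

Similarly, on account of (2) and (4),

$$P_{nh}(x, U) \ge n(1 - 3\varepsilon)\, P_h(x, U), \quad \text{that is,} \quad \frac{P_h(x, U)}{h} \le \frac{1}{1 - 3\varepsilon}\, \frac{P_{nh}(x, U)}{nh}.$$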


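Therefore, on account of c, letting $h \to 0$ then $t \to 0$,

$$\limsup_{h \to 0} \frac{P_h(x, U)}{h} \le \frac{1}{1 - 3\varepsilon}\, \liminf_{t \to 0} \frac{P_t(x, U)}{t}$$

and letting $\varepsilon \to 0$,

$$q(x, U) = \lim_{h \to 0} \frac{P_h(x, U)}{h}$$

exists with

$$(5) \qquad q(x, U) \le \frac{P_t(x, U)}{(1 - 3\varepsilon)\,t}, \qquad t \le t_0(U + \{x\}, \varepsilon).$$

It follows that $q(x, U) < \infty$ is a measure in $U$ since it is clearly
finitely additive and, as $U_n \downarrow \emptyset$, for
$t_0 = t_0(U_1 + \{x\}, \varepsilon)$,

$$q(x, U_n) \le \frac{P_{t_0}(x, U_n)}{(1 - 3\varepsilon)\,t_0} \to 0.$$

The proof is terminated.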

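Note that given $x$ and $V \in \mathcal{U}$, for $x \notin U \subset V$,
$P_h(x, U)/h \to q(x, U)$ uniformly in $U \subset V$. For, if
$h \le t_0(V \cup \{x\}, \varepsilon)$ then, by (5),

$$\frac{P_h(x, U)}{h} - q(x, U) \ge -3\varepsilon\, q(x, U) \ge -3\varepsilon\, q(x, V), \qquad U \subset V,$$

and

$$\frac{P_h(x, U)}{h} - q(x, U) = \frac{P_h(x, V)}{h} - q(x, V) + q(x, V - U) - \frac{P_h(x, V - U)}{h} \le \frac{P_h(x, V)}{h} - q(x, V) + 3\varepsilon\, q(x, V),$$

so that

$$\sup_{U \subset V} \left| \frac{P_h(x, U)}{h} - q(x, U) \right| \le 3\varepsilon\, q(x, V) + \left| \frac{P_h(x, V)}{h} - q(x, V) \right| \to 0.$$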

What precedes, together with our convention $q(x, S) = q(x, S\{x\}^c)$,
yields

A. Tr.pr.'s differentiation theorem. Under the continuity condition, the
derivative $P'_0(x, U)$ at $t = 0$ of the tr.pr. $P_t(x, U)$ exists for every
state $x$ and every uniform continuity state set $U$. In fact, for every
state $x$ and every uniform continuity state set $V$,

$$\frac{P_h(x, U) - I(x, U)}{h} \to q(x, U) - q(x)\,I(x, U) = P'_0(x, U)$$

uniformly in $U \subset V$; the nonnegative function $q(x) \le \infty$ is
Borelian; the function $q(x, U) = q(x, U\{x\}^c) < \infty$ bounded by $q(x)$
is Borelian in $x$ and a finite measure in $U$, with

$$q(x, U) \le \frac{P_t(x, U\{x\}^c)}{(1 - 3\varepsilon)\,t}$$

for all $U \cup \{x\} \subset V$ and $t \le t_0(V, \varepsilon)$,
$0 < \varepsilon < 1/3$.
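The differentiation theorem can be checked numerically in the simplest setting. The sketch below assumes a finite state space and a tr.pr. of the form $P_t = \exp(tQ)$, where the off-diagonal entries of the (hypothetical) matrix `RATES` play the role of $q(x, \{y\})$ and the diagonal entries are $-q(x)$; none of these names come from the text. It only illustrates that the difference quotient $(P_h - I)/h$ approaches $q(x, U) - q(x) I(x, U)$ entrywise.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transition_matrix(rates, t, terms=40):
    """P_t = exp(t * rates) by a truncated Taylor series (adequate for small t)."""
    n = len(rates)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for m in range(1, terms):
        # term becomes (t * rates)**m / m!
        term = mat_mul(term, [[t * v / m for v in row] for row in rates])
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

# Hypothetical 3-state example; state 2 is absorbing since q(2) = 0.
RATES = [[-3.0, 2.0, 1.0],
         [0.5, -0.5, 0.0],
         [0.0, 0.0, 0.0]]

def difference_quotient(h):
    """Entrywise (P_h - I) / h."""
    p = transition_matrix(RATES, h)
    n = len(RATES)
    return [[(p[i][j] - (1.0 if i == j else 0.0)) / h for j in range(n)]
            for i in range(n)]
```

Calling `difference_quotient(h)` for decreasing `h` should give matrices that approach `RATES`, with an error of order `h`.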

If the function $q(x)$ is bounded then $\mathcal{X}$ is $q$-bounded and the
continuity condition is uniform (for all $x \in \mathcal{X}$). Conversely, if
the continuity condition is uniform, that is, $\mathcal{X} \in \mathcal{U}$
then, by A, $q(x) = q(x, \mathcal{X}) \le 1/(1 - 3\varepsilon)t_0$ for
$t_0 = t_0(\mathcal{X}, \varepsilon)$ and the function $q(x)$ is bounded. Thus

Corollary. The function $q(x)$ is bounded if and only if the continuity
condition is uniform. Then the tr.pr.'s differentiation theorem is valid with
$U \in \mathcal{U}$ replaced by $S \in \mathcal{S}$ and
$q(x, \mathcal{X}) = q(x)$.

If the function $q(x)$ is finite, hence $\mathcal{X} = [x : q(x) < \infty]$,
then there exists a countable partition of $\mathcal{X}$ into uniform
continuity state sets $U_j$ (say, $U_j = [x : j - 1 \le q(x) < j]$). In
general, whenever there exists such a countable partition
$\mathcal{X} = \sum U_j$ we say that the continuity condition is
$\sigma$-uniform. For example, if the set of instantaneous states is
countable, $[x : q(x) = \infty] = (x_1, x_2, \cdots)$, then the continuity
condition is $\sigma$-uniform; it suffices to take
$U_j = [x : j - 1 \le q(x) < j] + \{x_j\}$. In particular, if the state space
is countable then the continuity condition is $\sigma$-uniform. Thus, we may
consider $\sigma$-uniformity as a "natural" transposition of this property of
countable state spaces to general state spaces.
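As a small illustration of the partition $U_j = [x : j - 1 \le q(x) < j]$, the following sketch groups the states of a finite (hypothetical) state space by the level of $q$. The dictionary `q` and the function name are assumptions made for the example only.

```python
import math

def sigma_uniform_partition(q):
    """Group states with finite q(x) into U_j = [x : j - 1 <= q(x) < j], j = 1, 2, ..."""
    partition = {}
    for state, rate in q.items():
        if math.isinf(rate):
            # Instantaneous states need the separate treatment described in the text.
            raise ValueError("instantaneous state: q(x) is infinite")
        j = math.floor(rate) + 1
        partition.setdefault(j, set()).add(state)
    return partition
```

Each set of the partition is $q$-bounded (by $j$), hence a uniform continuity state set.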

Let $\bar q(x) = \sup_{U \in \mathcal{U}} q(x, U)$ and note that there is
always a sequence $V_n \uparrow V = \bigcup V_n$ contained in $\mathcal{U}$
such that $q(x, V_n) \uparrow \bar q(x)$.

f. Extension lemma. If $\bar q(x) < \infty$ then, for this $x$, the measure
$q(x, U)$ in $U$ extends to a measure $q(x, S)$ in $S$, and the extension is
finite with $q(x, \mathcal{X}) = \bar q(x)$.

If the continuity condition is $\sigma$-uniform then, for every $x$, the
measure $q(x, U)$ in $U$ extends to a measure $q(x, S)$ in $S$, and the
extension is unique.

Proof. We use, without further comment, the closure properties of
$\mathcal{U}$ and the fact that a nondecreasing sequence of measures
$\mu_n \uparrow \mu$ on $\mathcal{S}$ converges to a measure $\mu$ on
$\mathcal{S}$; it suffices to note that, as $n \to \infty$ then
$m \to \infty$,

$$\mu\Big(\sum_j S_j\Big) \ge \mu_n\Big(\sum_j S_j\Big) \ge \sum_{j=1}^{m} \mu_n(S_j) \to \sum_{j=1}^{m} \mu(S_j) \to \sum_{j=1}^{\infty} \mu(S_j),$$

while

$$\mu_n\Big(\sum_j S_j\Big) = \sum_j \mu_n(S_j) \le \sum_j \mu(S_j).$$

Since

$$q(x, V_n) \uparrow \bar q(x) < \infty, \qquad V_n \uparrow V, \qquad V_n \in \mathcal{U},$$

it follows, by A, that

$$q(x, UV^c) = q(x, UV^c + V_n) - q(x, V_n) \le \bar q(x) - q(x, V_n) \to 0$$

and

$$q(x, U) = q(x, UV) + q(x, UV^c) = q(x, UV) = \lim_n q(x, UV_n).$$

Therefore, the measure $q(x, S)$ in $S$ defined by

$$q(x, SV_n) \uparrow q(x, S) = q(x, SV)$$

is an extension of the measure $q(x, U)$ in $U$, with

$$q(x, \mathcal{X}) = q(x, V) = \bar q(x) < \infty.$$

The first assertion is proved.

If the continuity condition is $\sigma$-uniform, then
$\mathcal{X} = \sum_n U_n$, or $V_n = \sum_{k=1}^{n} U_k \uparrow \mathcal{X}$,
$V_n \in \mathcal{U}$. Therefore, $q(x, UV_n) \uparrow q(x, U)$ and the
extension of the measure $q(x, U)$ in $U$ to a measure $q(x, S)$ in $S$ is
determined by the necessary condition $q(x, SV_n) \uparrow q(x, S)$. The
second assertion is proved, and the proof is terminated.


B. Extended tr.pr.'s differentiation theorem. Under the continuity
condition, if $q(x) = \bar q(x) < \infty$ (or under the $\sigma$-uniform
continuity condition, if $q(x, \mathcal{X}) = q(x) < \infty$) then, for this
$x$, the derivative $P'_0(x, S)$ exists for every $S$. In fact,

$$\frac{P_h(x, S) - I(x, S)}{h} \to q(x, S) - q(x)\,I(x, S) = P'_0(x, S)$$

uniformly in $S$ and $q(x, S)$ is a finite measure in $S$.

Proof. It suffices to prove the assertion for $x \notin S$ so that the term
in $q(x)$ disappears. According to the hypothesis and the extension lemma,
the measure $q(x, U)$ in $U$ extends to a measure $q(x, S)$ in $S$ with

$$q(x, \mathcal{X}) = \bar q(x) = q(x) < \infty$$

and

$$q(x, SV_n) \uparrow q(x, S) = q(x, SV), \qquad V_n \uparrow V, \qquad V_n \in \mathcal{U}.$$

It follows that

$$\left| \frac{P_h(x, S)}{h} - q(x, S) \right| \le \left| \frac{P_h(x, SV_n)}{h} - q(x, SV_n) \right| + \left| \frac{P_h(x, SV_n^c)}{h} - q(x, SV_n^c) \right| \to 0$$

uniformly in $S$. For, by the tr.pr.'s differentiation theorem, for $n$
fixed, the first term on the right converges to zero uniformly in $SV_n$,
hence in $S$, as $h \to 0$, and the upper bound of the second term below

$$\frac{P_h(x, V_n^c\{x\}^c)}{h} + q(x, V_n^c) = \frac{1 - P_h(x, \{x\})}{h} - \frac{P_h(x, V_n\{x\}^c)}{h} + q(x, V_n^c)$$

contains no $S$ and converges to
$q(x) - q(x, V_n) + q(x, V_n^c) = 2\,(q(x) - q(x, V_n))$, then to zero, as
$h \to 0$ then $n \to \infty$.

What precedes is valid with $V = \mathcal{X}$ under the $\sigma$-uniform
continuity condition and $q(x, \mathcal{X}) = q(x)$. The proof is terminated.

Corollary. If the two functions $q(\cdot)$ and $\bar q(\cdot)$ coincide and
there are no instantaneous states, then the derivative $P'_t(x, S)$ exists
and is finite and continuous in $t \ge 0$ for every $x$ and every $S$, and
the backward equation holds:

$$P'_t(x, S) = \int q(x, dy)\, P_t(y, S) - q(x)\, P_t(x, S).$$

For, by Kolmogorov's equation,

$$\frac{P_{t+h}(x, S) - P_t(x, S)}{h} = \int_{\{x\}^c} \frac{P_h(x, dy)}{h}\, P_t(y, S) - \frac{1 - P_h(x, \{x\})}{h}\, P_t(x, S) \to \int q(x, dy)\, P_t(y, S) - q(x)\, P_t(x, S)$$

upon using the following propositions:

upon using the following propositions: 

g. If finite measures $\mu_n$ on $\mathcal{S}$ converge to a finite measure
$\mu$ on $\mathcal{S}$ and $g$ on $\mathcal{X}$ is a bounded Borel function,
then $\int g\, d\mu_n \to \int g\, d\mu$.

It suffices to note that $g$ can be approximated up to any given
$\varepsilon > 0$ by simple functions $g' = \sum_{j=1}^{m} x_j I_{S_j}$
uniformly bounded by some finite constant $c$ so that, as $n \to \infty$ then
$\varepsilon \to 0$,

$$\left| \int g\, d\mu_n - \int g\, d\mu \right| \le \int |g - g'|\, d\mu + \left| \int g'\, d\mu - \int g'\, d\mu_n \right| + \int |g' - g|\, d\mu_n \le \varepsilon\mu(\mathcal{X}) + c \sum_{j=1}^{m} |\mu S_j - \mu_n S_j| + \varepsilon\mu_n(\mathcal{X}) \to 0.$$

According to this proposition, the foregoing passage to the limit as
$h \to 0$ is valid. Furthermore

If for $t \ge 0$, the function $g(t)$ is continuous and its right derivative
$g'_+(t)$ exists and is continuous, then the derivative exists (and coincides
with the right derivative).

For, setting $h(t) = \int_0^t g'_+(s)\, ds$ so that $h'(t) = g'_+(t)$, the
assertion reduces to the classical one that a continuous function
$g(t) - h(t)$ whose right derivative vanishes is a constant, hence has a
vanishing derivative.

Since after the foregoing passage to the limit as $h \to 0$, the existence
of a right derivative $P'_{t+}(x, S)$ equal to a continuous function in $t$
is established, the term "right" may be omitted on account of the
proposition just established. The proof of the corollary is completed.
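The backward equation can be verified on the two-state chain, whose tr.pr. is known in closed form. The sketch below assumes states 0 and 1 with hypothetical rates `A` $= q(0) = q(0, \{1\})$ and `B` $= q(1) = q(1, \{0\})$, and uses the standard formulas $P_t(0, \{0\}) = (B + A e^{-(A+B)t})/(A+B)$ and $P_t(1, \{0\}) = (B - B e^{-(A+B)t})/(A+B)$. It compares a numerical derivative of $P_t(x, \{0\})$ with the right-hand side of the backward equation.

```python
import math

A, B = 2.0, 3.0   # hypothetical rates: q(0) = A, q(1) = B

def p_to_zero(x, t):
    """P_t(x, {0}) for the two-state chain."""
    decay = math.exp(-(A + B) * t)
    if x == 0:
        return (B + A * decay) / (A + B)
    return (B - B * decay) / (A + B)

def backward_rhs(x, t):
    """Integral of q(x, dy) P_t(y, {0}) minus q(x) P_t(x, {0})."""
    rate, other = (A, 1) if x == 0 else (B, 0)
    return rate * p_to_zero(other, t) - rate * p_to_zero(x, t)

def central_difference(x, t, h=1e-5):
    """Numerical approximation of the derivative of P_t(x, {0}) in t."""
    return (p_to_zero(x, t + h) - p_to_zero(x, t - h)) / (2 * h)
```

For both starting states the two quantities agree up to the discretization error of the central difference.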

44.2. Sample functions behavior. Let $X_T = (X_t, t \ge 0)$ be a separable
r.f. and let $\mathcal{B}(X_T)$ be the $\sigma$-field of events defined on
$X_T$. We denote by $X_{s+T} = (X_{s+t}, t \ge 0)$ the translate by $s$ of
$X_T$. Let $g$ be a numerical Borel function on the sample space of $X_T$
such that the exp. of $g(X_T)$ exists. The translate by $s$ of $g(X_T)$ is
$g(X_{s+T})$ and $B_s$, the translate by $s$ of an event
$B \in \mathcal{B}(X_T)$, is defined by $I_{B_s} = (I_B)_s$. We denote by
$P_t$ the distribution of $X_t$: $P_t(S) = P[X_t \in S]$.

Throughout this subsection, we assume that $X_T$ is a stationary regular
Markov r.f. with tr.pr. $P_t(x, S)$, unless otherwise stated. To be precise:
there exists a family $(P_x, x \in \mathcal{X})$ of pr.'s on
$\mathcal{B}(X_T)$ and regular versions of c.pr.'s and c.exp.'s below (the
only ones we shall use) such that, for every $x \in \mathcal{X}$,
$S \in \mathcal{S}$, $B \in \mathcal{B}(X_T)$, $0 \le s \le t$, the Markov
property holds:

(M) $P(B_s \mid X_r, r \le s) = P(B_s \mid X_s)$

and is stationary:

(S) $P(B_s \mid X_s = x) = P(B \mid X_0 = x) = P_x B$

with tr.pr. $P_t(x, S)$:

(Tr.) $P(X_t \in S \mid X_0 = x) = P_x[X_t \in S] = P_t(x, S)$.

Upon denoting by $E_x$ the exp. which corresponds to $P_x$ and approximating
Borel functions by simple Borel ones, it follows that

$$E(g(X_{s+T}) \mid X_r, r \le s) = E(g(X_{s+T}) \mid X_s),$$
$$E(g(X_{s+T}) \mid X_s = x) = E_x\, g(X_T).$$

Note that upon conditioning by $X_{s_1}, \dots, X_s$, $s_1 \le \dots \le s$,
we may replace in what precedes $X_r$, $r \le s$, by $X_{s_1}, \dots, X_s$.

We also assume that, unless otherwise stated, the continuity condition holds:

(C) $P_h(x, S) \to I(x, S)$,

equivalently, $P_h(x, \{x\}) \to 1$ or, by 39.1c, $P_t(x, S)$ is uniformly
continuous in $t$ uniformly in $S$.

a. $X_T$ is continuous in pr.; in fact, for every $x$,
$P_x[X_{t+h} \ne X_t] \to 0$ for every $t \ge 0$ and
$P_x[X_{t-h} \ne X_t] \to 0$ for every $t > 0$.

Proof. The first assertion follows from the second one since, by the
dominated convergence theorem,

$$P[\,|X_{t \pm h} - X_t| \ge \varepsilon\,] = E\,P(|X_{t \pm h} - X_t| \ge \varepsilon \mid X_0) \le E\,P(X_{t \pm h} \ne X_t \mid X_0) \to 0.$$

Since, by Markov property,

$$P(X_{t+h} \ne X_t \mid X_0) = E\{P(X_{t+h} \ne X_t \mid X_t, X_0) \mid X_0\} = E\{P(X_{t+h} \ne X_t \mid X_t) \mid X_0\},$$

it follows that, for $X_0 = x$,

$$P_x[X_{t+h} \ne X_t] = E_x\, P(X_{t+h} \ne X_t \mid X_t),$$

and, by stationarity,

$$P(X_{t+h} \ne X_t \mid X_t = y) = P(X_h \ne X_0 \mid X_0 = y) = P_h(y, \{y\}^c).$$

Therefore, by the continuity condition which says that
$P_h(y, \{y\}^c) \to 0$ and by the dominated convergence theorem,

$$P_x[X_{t+h} \ne X_t] = \int P_t(x, dy)\, P_h(y, \{y\}^c) \to 0.$$

Similarly, replacing $t$ by $t - h$, by the continuity condition and its
implication $P_{t-h}(x, S) \to P_t(x, S)$,

$$P_x[X_t \ne X_{t-h}] = \int P_{t-h}(x, dy)\, P_h(y, \{y\}^c) \to 0,$$

and the assertion is proved.
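The last passage to the limit is based upon the following proposition.

b. If finite measures $\mu_n \to \mu$ finite measure on $\mathcal{S}$ and
Borel functions $g_n \to g$ Borel function on $\mathcal{X}$, the $g_n$ being
uniformly bounded by a finite constant $c$, then
$\int g_n\, d\mu_n \to \int g\, d\mu$.

For, by Egorov's theorem, given $\varepsilon > 0$ there exists $S$ with
$\mu S^c < \varepsilon$ and $s_n = \sup_{x \in S} |g_n(x) - g(x)| \to 0$ and,
by 39.1g, as $n \to \infty$ then $\varepsilon \to 0$,

$$\left| \int g_n\, d\mu_n - \int g\, d\mu \right| \le \left| \int g\,(d\mu_n - d\mu) \right| + s_n\, \mu_n S + 2c\, \mu_n S^c \to 0.$$

Note that the assumptions "measure" $\mu$ and "Borel function" $g$ may be
omitted, for they follow from the convergence assumptions.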

c. Duration of stay lemma. The pr. that starting from $x$ at time $s$,
$X_t$ stays in $x$ during time $t$ is given by

$$P(X_{s+r} = x,\ 0 \le r \le t \mid X_s = x) = e^{-q(x)t}.$$

Proof. Since $X_T$ is separable and continuous in pr., we can replace
$[0, \infty)$ by a countable set $S$ dense in $[0, \infty)$, say, the set of
dyadic numbers $jh_n$, $h_n = (1/2)^n$. Thus, the sets

$$[X_{s+r} = x,\ 0 \le r \le t] = [X_{s+r} = x,\ 0 \le r \le t,\ r \in S]$$

are events. Because of stationarity, it suffices to prove the asserted
relation for $s = 0$.

Let $k_n = [t/h_n]$ so that $k_n h_n \to t$ and, to simplify the writing,
drop the subscripts $n$ so that $h = h_n \to 0$ and $k = k_n \to \infty$ as
$n \to \infty$. By Markov property and continuity condition
($P(X_t = x \mid X_{kh} = x) \to 1$),

$$p_h = \prod_{j \le k} P(X_{jh} = x \mid X_{(j-1)h} = x) \to P(X_s = x,\ 0 \le s \le t \mid X_0 = x)$$

and, by stationarity and 44.1d (see its proof),

$$p_h = (P_h(x, \{x\}))^k = \exp\left\{ \frac{\log P_h(x, \{x\})}{h}\, kh \right\} \to \exp\{-q(x)t\}.$$

The proof is terminated.
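The limit used in this proof can be observed numerically. The sketch below assumes a two-state chain with hypothetical rates `a` $= q(0)$ and `b` $= q(1)$, for which $P_h(0, \{0\}) = (b + a e^{-(a+b)h})/(a+b)$ is a standard closed form; it evaluates $p_h = (P_h(0, \{0\}))^k$ along the dyadic steps $h = 2^{-n}$, $k = [t/h]$, and compares the result with $e^{-q(0)t}$.

```python
import math

def stay_probability(a, b, h):
    """P_h(0, {0}) for the two-state chain with rates a (0 -> 1) and b (1 -> 0)."""
    return (b + a * math.exp(-(a + b) * h)) / (a + b)

def discretized_stay(a, b, t, n):
    """p_h = P_h(0, {0})**k with h = 2**-n and k = floor(t / h)."""
    h = 0.5 ** n
    k = math.floor(t / h)
    return stay_probability(a, b, h) ** k
```

As `n` grows, `discretized_stay(a, b, t, n)` should decrease toward `math.exp(-a * t)`: for a fixed step the chain may leave state 0 and return between two inspection times, which the continuous-time event excludes.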

Upon introducing the duration of stay $\tau(\omega)$ of $X_t(\omega)$ in some
state and considering the three cases $q(x) = 0$, $q(x) = \infty$, and
$0 < q(x) < \infty$, c yields

A. Duration of stay theorem. The duration of stay $\tau$ is a random
time, not necessarily finite, with $P_x[\tau > t] = e^{-q(x)t}$.

In particular, outside $P_x$-null events, when at some fixed time $X_t$ takes
the value $x$, then it stays in $x$ forever when $x$ is absorbing, or leaves
it at once when $x$ is instantaneous, or stays in it for some (random) time
and then leaves it when $x$ is steady.

The last statement explains the classification of states $x$ into absorbing,
instantaneous, or steady, according as $q(x) = 0$, $q(x) = \infty$, or
$0 < q(x) < \infty$.

The sets $B_n = [X_{s+t} = X_s,\ 0 \le t < 1/n] \uparrow B = \bigcup B_n$ are
events since $X_T$ is separable, and so is their limit $B$, the set to which
correspond the sample functions remaining constant for some positive time
after $s$. By A and the dominated convergence theorem,

$$PB \leftarrow PB_n = \int P_s(dx)\, e^{-q(x)/n} = \int_{[q(\cdot) < \infty]} P_s(dx)\, e^{-q(x)/n} \to P_s[q(\cdot) < \infty].$$

Similarly, $C_n = [X_{s+t} \ne X_s \text{ for some } t \le n] \uparrow C$, the
set to which correspond sample functions having a discontinuity some time
after $s$, and

$$PC \leftarrow PC_n = \int P_s(dx)\,(1 - e^{-q(x)n}) = \int_{[q(\cdot) > 0]} P_s(dx)\,(1 - e^{-q(x)n}) \to P_s[q(\cdot) > 0].$$

In particular

Corollary. If there are no instantaneous states then, after any given
time $s$, almost all sample functions are constant for some positive times,
finite or infinite.

If there are no absorbing states then, after any given time $s$, almost all
sample functions have a discontinuity at some finite time, positive or not.

If there are only steady states then, after any given time $s$, almost all
sample functions are constant for some finite positive time and then have a
discontinuity.

At first sight, if there are no instantaneous states, then we expect almost
all sample functions to stay constant for some positive times, then jump and
remain constant for some positive times, and so on, unless they get into
some absorbing state and then stay there forever. Thus, we expect that
almost all the sample functions will have a finite or infinite sequence of
isolated jumps, that is, preceded and followed by time intervals of
constancy. Yet, more complicated discontinuities may occur unless some
restrictions are imposed. To begin, we shall study those sample functions
$X_T(\omega)$ whose first discontinuity time $\tau$ after any given $s$ is of
the type: $X_t(\omega)$ is constant up to a finite positive time
$\tau(\omega)$, is discontinuous at $\tau(\omega)$, and is constant with a
different value after $\tau(\omega)$ for some time $h(\omega) > 0$. Because
of the separability implications 38.2(3°), these simple discontinuities are
isolated jumps, provided we neglect a null set of sample functions. From now
on, we assume $q(\cdot)$ finite and we complete our measures so that subsets
of null sets are null.

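Fix $t > 0$, a steady state $x$, and a uniform continuity state set
$U \not\ni x$. Let $D_{n,h}$ be the set of $\omega$'s such that
$X_r(\omega) = x$ for $0 \le r \le kt/n$ and $X_r(\omega) = y$ for
$(k+1)t/n \le r \le (k+1)t/n + h$, for some $k < n$ and some $y \in U$.
According to A,

$$P_x D_{n,h} = \sum_{k=0}^{n-1} e^{-q(x)kt/n} \int_U P_{t/n}(x, dy)\, e^{-q(y)h} = \frac{1 - e^{-q(x)t}}{(1 - e^{-q(x)t/n})/(t/n)} \int_U \frac{P_{t/n}(x, dy)}{t/n}\, e^{-q(y)h}.$$

Therefore, by 44.1A and 44.1g, as $n \to \infty$,

$$P_x D_{n,h} \to p_h = \frac{1 - e^{-q(x)t}}{q(x)} \int_U q(x, dy)\, e^{-q(y)h},$$

then, as $h \to 0$,

$$p_h \to p = (1 - e^{-q(x)t})\, \frac{q(x, U)}{q(x)}.$$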


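Let $\underline{D}_h = \liminf_n D_{n,h}$, $\overline{D}_h = \limsup_n D_{n,h}$
and note that
$\underline{D}_h \uparrow \underline{D} = \bigcup_n \underline{D}_{1/n}$,
$\overline{D}_h \uparrow \overline{D} = \bigcup_n \overline{D}_{1/n}$ as
$h \to 0$. Then, by the Fatou-Lebesgue theorem, from
$P_x \underline{D}_h \le p_h \le P_x \overline{D}_h$, it follows that

$$P_x \underline{D} \le p \le P_x \overline{D}.$$

Let $D_h$ be the set of $\omega$'s such that $X_r(\omega) = x$ for
$0 \le r < \tau(\omega) < t$ and $X_r(\omega) = y$ for
$\tau(\omega) < r < \tau(\omega) + h$ for some $y \in U$. Then
$D_h \uparrow D$ as $h \to 0$ and $D$ is the set of all such $\omega$ with
some $h = h(\omega) > 0$. Thus, $D$ corresponds to the set of those sample
functions which have an isolated jump from $x$ into $U$ at some time less
than $t$. According to the above definitions, if $\omega \notin D_h$ then
$\omega \in D_{n,h}$ for finitely many values of $n$ only, hence
$\omega \notin \overline{D}_h$. Thus, $\overline{D}_h \subset D_h$ and
$\overline{D} \subset D$. Similarly, if $\omega \in D_h$ then, for
$n > 2t/h$ sufficiently large, there exists a $k < n$ such that
$kt/n \le \tau(\omega) < (k+1)t/n$ and $X_r(\omega) = y$ for
$(k+1)t/n \le r \le (k+1)t/n + h - t/n$ for some $y \in U$, hence
$\omega \in D_{n,h/2}$ for $n$ sufficiently large. Thus,
$D_h \subset \underline{D}_{h/2}$ and $D \subset \underline{D}$. It follows
from $\underline{D} \subset \overline{D}$ that
$\underline{D} = D = \overline{D}$. Therefore, $P_x D = p$ since
$P_x \underline{D} \le p \le P_x \overline{D}$.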

The preceding discussion remains valid when $U$ is replaced by $S$ and
$q(x, S) = \lim_{h \to 0} P_h(x, S)/h$ exists for all $S \not\ni x$; by
44.1B, it is so when $\bar q(x) = \sup_{U \in \mathcal{U}} q(x, U) = q(x)$
since $q(x) < \infty$. Thus, when $q(\cdot)$ is finite

d. The sample functions starting from a steady state $x$ at time $s$, which
remain constant for some positive times less than $t$ then jump into a
uniform continuity state set $U \not\ni x$ and remain constant for some
times, correspond to a set $D$ with
$P_x D = (1 - e^{-q(x)t})\, q(x, U)/q(x)$. If, moreover, $q(x) = \bar q(x)$
then what precedes is valid with $S$ in lieu of $U$.


We recall that $S$ denotes any state set while $U$ denotes only the uniform
continuity ones.

We make the following convention: $q(x, S)/q(x) = 0$ when $q(x) = 0$.

B. Isolated jumps theorem. Let $X_T$ be in a state $x$ at time $s$ and $q$
be finite. Then

The pr. that there be a sample discontinuity in the finite or infinite
interval $(s, s + u)$ and the first one be an isolated jump into
$U \not\ni x$ is given by $(1 - e^{-q(x)u})\, q(x, U)/q(x)$. If there is a
sample discontinuity in $(s, s + u)$ (and when $x$ is steady, a.s. there is
at least one after $s$), then the pr. that the first one be an isolated jump
into $U$ is given by $q(x, U)/q(x)$.

If, moreover, $q(x) = \bar q(x)$ then what precedes holds with $S$ in lieu of
$U$, and, when $x$ is steady, a.s. there is a first discontinuity which is an
isolated jump.

Proof. The first assertion and the one with $S$ in lieu of $U$ rephrase d
together with the convention about absorbing states, and the second
assertion follows by A. The assertion about an isolated jump without
specifying into which set means that the pr. of an isolated jump from $x$
into $\{x\}^c$ is one and results from $q(x, \{x\}^c) = q(x)$. The proof is
terminated.
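The discretization behind the value $(1 - e^{-q(x)t})\, q(x, U)/q(x)$ can be reproduced for a finite chain. The sketch below assumes a three-state chain with tr.pr. $P_t = \exp(tQ)$ for a hypothetical rate matrix `RATES` (off-diagonal entries $q(x, \{y\})$, diagonal $-q(x)$). It evaluates the sum $\sum_{k<n} e^{-q(x)kt/n} \int_U P_{t/n}(x, dy)\, e^{-q(y)h}$ and compares it with the limit value.

```python
import math

def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transition_matrix(rates, t, terms=40):
    """P_t = exp(t * rates) by a truncated Taylor series (adequate for small t)."""
    n = len(rates)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for m in range(1, terms):
        term = mat_mul(term, [[t * v / m for v in row] for row in rates])
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

# Hypothetical rate matrix with three steady states.
RATES = [[-3.0, 2.0, 1.0],
         [1.0, -4.0, 3.0],
         [2.0, 2.0, -4.0]]

def discretized_jump_probability(x, target, t, n, h):
    """Stay in x up to kt/n, be in `target` at (k+1)t/n, then stay there for time h."""
    q = [-RATES[i][i] for i in range(len(RATES))]
    step = transition_matrix(RATES, t / n)
    landing = sum(step[x][y] * math.exp(-q[y] * h) for y in target)
    return sum(math.exp(-q[x] * k * t / n) for k in range(n)) * landing

def limit_jump_probability(x, target, t):
    """(1 - exp(-q(x) t)) q(x, U) / q(x) for U = target, x not in target."""
    q_x = -RATES[x][x]
    return (1.0 - math.exp(-q_x * t)) * sum(RATES[x][y] for y in target) / q_x
```

With `n` large and `h` small, `discretized_jump_probability(0, [1], t, n, h)` should be close to `limit_jump_probability(0, [1], t)`.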


At first sight, once isolated jumps occur, the same stationary Markov
evolution starts anew. However, this means that we can use the random time
$\tau$, to be precise $\tau + 0$, as a present or past moment in the
relations (M), (S), and (Tr.) at the beginning of this subsection. Since
these relations pertain to all states and all state sets, we shall have to
assume not only that there are no instantaneous states but also that the two
functions $q(x)$ and $\bar q(x)$ coincide. It will suffice to prove that
$\tau + 0$ is a stationary Markov time of $X_T$, that is, the relation

(SM$_{st}$) $P(X_{\tau+t} \in S \mid X_r, r \le \tau + s) = P_{t-s}(X_{\tau+s}, S)$, $0 \le s < t < \infty$

has meaning and is valid. Thus, we shall have first to show that all
$X_{\tau+t}$, $t \ge 0$, are r.v.'s for $\tau$ finite, that is, on
$\Omega_\tau = [\tau < \infty]$. The conditioning by
$(X_r, r \le \tau + s)$ means then conditioning by the $\sigma$-field in
$\Omega_\tau$ of all events $A \subset \Omega_\tau$ such that
$A[\tau \le t] \in \mathcal{B}(X_r, r \le t + s)$. The proof will be based
upon 43.4d, the only result we require in Section 43.

e. Isolated jump time lemma. Let the functions $q(x)$ and $\bar q(x)$
coincide and be finite. Then the first isolated jump time $\tau$ ($\tau + 0$
to be precise) is a stationary Markov time of $X_T$.

Proof. Assume that there are only steady states so that, by A, $\tau$ is a
r.v. with $P_x[\tau > t] = e^{-q(x)t}$. By its definition, $\tau$ is a time
of $X_T$, that is, $[\tau \le t] \in \mathcal{B}(X_r, r \le t)$. For, if we
know a sample function $(X_r(\omega), r \le t)$ up to time $t$ inclusive,
then we know whether it left the state $X_0(\omega)$ or not during this time
interval.

To prove the assertion, we subdivide $[0, \infty)$ into intervals of length
$h_n = (1/2)^n$, denote by $\tau_n(\omega)$ the first of the subdivision
points which follows $\tau(\omega)$, approximate functions of $\tau$ by
functions of $\tau_n$ and let $n \to \infty$. The following immediate
properties will be used without further comment:
$\tau < \tau_n \le \tau + h_n$, $\tau_n \to \tau + 0$,
$[\tau_n + t - h_n, \tau_n + t] \downarrow [\tau + t]$, and, knowledge of
$\tau(\omega)$ implying that of $\tau_n(\omega)$, $\tau_n$ is an elementary
time of $X_T$ hence, by 38.4d, is a stationary Markov time of $X_T$. The
property to be established and to play a central role is that, for every
$\tau + t$, $t \ge 0$, almost all sample functions $X_T(\omega)$ have a time
interval of constancy at $\tau(\omega) + t$, that is, on
$[\tau(\omega) + t, \tau(\omega) + t + h(\omega)]$, $h(\omega) > 0$.

The constancy property at $\tau + 0$ is immediate. For, by definition of an
isolated jump almost all sample functions have a time interval of constancy
from $\tau + 0$. Therefore, the limit
$X_{\tau(\omega)+0}(\omega) = X_{\tau_n(\omega)}(\omega)$ from some
$n = n(\omega)$ on exists, and $X_{\tau+0}$ is a r.v.

Given $t > 0$, we take $n$ sufficiently large so that $t - h_n > 0$. The
event $B_n = [X_{\tau_n + r} = z,\ r \in [t - h_n, t],\ z \in S]$
corresponds to the set of those sample functions $X_T(\omega)$ which are
constant in $S$ during the time interval
$[\tau_n + t - h_n, \tau_n + t]$. Thus,

$$P(B_n \mid X_{\tau_n} = y) = \int_S P_{t-h_n}(y, dz)\, e^{-q(z)h_n} \to P_t(y, S)$$

and, moreover,
$P(B_n \mid X_{\tau_n}(\omega)) = P(B_n \mid X_{\tau(\omega)+0}(\omega))$ from
some $n = n(\omega)$ on, because of the time interval of constancy at
$\tau + 0$. Since $B_n \downarrow B = \bigcap B_n$, it follows that

$$PB \leftarrow PB_n = \int P(B_n \mid X_{\tau(\omega)+0}(\omega))\, P(d\omega) + o(1) \to \int P_{\tau+0}(dy)\, P_t(y, S).$$

Similarly, if the event $C_n$ corresponds to the set of all sample functions
$X_T(\omega)$ which at time $\tau_n + t - h_n$ are either in $S$ or in some
state $z \in S^c$ and leave $z$ within time $h_n$, then

$$P(C_n \mid X_{\tau_n} = y) = P_{t-h_n}(y, S) + \int_{S^c} P_{t-h_n}(y, dz)\,(1 - e^{-q(z)h_n}) \to P_t(y, S)$$

and, setting $C = \liminf C_n$,

$$PC \le \liminf PC_n = \lim PC_n = PB.$$

Since $B_n \subset [X_{\tau+t} \in S] \subset C_n$ so that
$B \subset [X_{\tau+t} \in S] \subset C$, it follows that
$[X_{\tau+t} \in S]$ differs from $B$ by a null event; furthermore, $PB = 1$
when $S = \mathcal{X}$. Thus, $X_{\tau+t}$ is a r.v., almost all sample
functions have a time interval of constancy at $\tau + t$, and
$P(B_n \mid X_{\tau_n(\omega)}(\omega)) \to P(X_{\tau+t} \in S \mid X_{\tau(\omega)+0}(\omega))$.
Therefore, there is a regular version of c.pr.
$P(X_{\tau+t} \in S \mid X_{\tau+0}) = P_t(X_{\tau+0}, S)$. This result is
valid for every $X_{\tau+s}$, $s < t$, and what precedes applies with
$X_{\tau+s}$ in lieu of $X_{\tau+0}$; in particular, we can take
$P(X_{\tau+t} \in S \mid X_{\tau+s}) = P_{t-s}(X_{\tau+s}, S)$.

Since $\tau_n$ is a stationary Markov time of $X_T$ and
$A \in \mathcal{B}(X_r, r \le \tau + s) \subset \mathcal{B}(X_r, r \le \tau_n + s)$,
it follows that

$$PA[X_{\tau+t} \in S] \leftarrow PAB_n = \int_A P(B_n \mid X_{\tau_n + s})\, dP \to \int_A P(X_{\tau+t} \in S \mid X_{\tau+s})\, dP.$$

Thus,

$$PA[X_{\tau+t} \in S] = \int_A P_{t-s}(X_{\tau+s}, S)\, dP, \qquad A \in \mathcal{B}(X_r, r \le \tau + s),$$

that is, the integral form of (SM$_{st}$), hence (SM$_{st}$), are valid. So
far, we assumed that there were only steady states so that
$\Omega_\tau = [\tau < \infty]$ was an a.s. event. If there are also
absorbing states and $P\Omega_\tau > 0$, then what precedes applies upon
replacing all events by their intersections with $\Omega_\tau$. If
$P\Omega_\tau = 0$, then what precedes has no content but almost all sample
functions remain constant forever from time $t + 0$ on, and we may consider
this property as the trivial degenerate form of the proposition. The proof
is terminated.

A $q$-pair of functions $q(x)$, $q(x, S)$ will be called regular if they are
finite, nonnegative, Borelian in $x$, and $q(x, S)$ is a measure in $S$ with
$q(x, \{x\}) = 0$ and $q(x, \mathcal{X}) = q(x)$. We say that such a pair is
bounded if the functions are bounded. We say that such a pair derives from a
tr.pr. $P_t(x, S)$ if the function
$P'_0(x, S) = q(x, S) - q(x)\,I(x, S)$. In fact, then, by the corollary of
44.1B, the derivative $P'_t(x, S)$ exists and is continuous in $t$, the
backward equation holds, and the tr.pr. obeys the $\sigma$-uniform continuity
condition; if, moreover, the $q$-pair is bounded then, by the corollary of
44.1A, the tr.pr. obeys the uniform continuity condition.

C. Sample step functions theorem. Let a $q$-pair of functions $q(x)$,
$q(x, S)$ be regular.

If the $q$-pair derives from the tr.pr. $P_t(x, S)$ of a separable
stationary Markov r.f. $X_T = (X_t, t \ge 0)$, then there exists a random
time $\tau_\Omega$ of accumulation of isolated jumps, not necessarily
infinite, and almost all sample functions $X_T(\omega)$ are step functions in
$[0, \tau_\Omega(\omega))$. If, moreover, the $q$-pair is bounded, then
almost all sample functions are step functions.

Conversely, the $q$-pair derives from at least one tr.pr. $P_t(x, S)$ of a
separable stationary Markov r.f. $X_T = (X_t, t \ge 0)$ with a corresponding
random time $\tau_\Omega$. If, moreover, $\tau_\Omega$ is a.s. infinite, in
particular, if the $q$-pair is bounded, then the tr.pr. is unique.

Proof. We use without further comment the isolated jumps theorem and the
strong Markov jump time lemma.

1° If there are no absorbing states (that is, if the $q$-functions are
positive), then there is a sequence of finite positive random times
$\tau_1, \tau_2, \cdots$ such that almost all sample functions $X_T(\omega)$
are constant on $[0, \tau_1(\omega))$,
$[\tau_1(\omega), \tau_1(\omega) + \tau_2(\omega))$, $\cdots$, with different
values in any two consecutive intervals; we set $\tau_0(\omega) + 0 = 0$. If
there are absorbing states, then, whenever $X_T(\omega)$ is in such a state
for the first time, at some $\tau_{n-1}(\omega) + 0$, then it stays there
forever so that $\tau_n(\omega) = \infty$ and we set
$\tau_{n+1}(\omega) = \tau_{n+2}(\omega) = \cdots = \infty$.

In either case, the sum $\tau_\Omega(\omega)$ of the series
$\sum \tau_n(\omega)$ of positive terms exists and is finite or infinite. If
$\tau_\Omega(\omega) = \infty$, then the sample function is a step function.
If $\tau_\Omega(\omega) < \infty$, then we know only that $X_T(\omega)$ is a
step function on $[0, \tau_\Omega(\omega))$. Thus, the sample functions which
correspond to the set $[\tau_\Omega = \infty]$ are step functions.

In particular, if the $q$-pair is bounded by $c < \infty$, then
$P[\tau_\Omega = \infty] = 1$ so that almost all sample functions are step
functions. For, then

$$P[\tau_n \ge t] = \int P_{\tau_{n-1}+0}(dx)\, P_x[\tau_n \ge t] \ge e^{-ct}$$

for every $n$ and every $t > 0$, so that
$P(\limsup\, [\tau_n \ge t]) \ge e^{-ct}$, hence infinitely many $\tau_n$ are
$\ge t$ with pr. $\ge e^{-ct}$ and, thus,
$P[\tau_\Omega = \infty] \ge e^{-ct} \to 1$ as $t \to 0$. The direct
assertions are proved.
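The role of boundedness can be illustrated on a pure-birth chain, an example that is not in the text: the states are $0, 1, 2, \dots$, each jump goes from $n$ to $n + 1$, and the holding time in $n$ is exponential with mean $1/q(n)$. The expected accumulation time is then $\sum_n 1/q(n)$. The sketch below, with hypothetical rate functions, computes partial sums of this series; when the series converges the accumulation time has finite expectation, hence is a.s. finite, whereas for rates bounded by $c$ every term is at least $1/c$ and the partial sums grow without bound, in line with $P[\tau_\Omega = \infty] = 1$.

```python
import math

def expected_accumulation_time(rate, count):
    """Sum of the mean holding times 1/q(n) over the first `count` states."""
    return sum(1.0 / rate(n) for n in range(count))

def quadratic_rate(n):
    """Unbounded rates q(n) = (n + 1)**2: the series of means converges."""
    return float((n + 1) ** 2)

def bounded_rate(n):
    """Rates bounded by c = 2: the series of means diverges."""
    return 2.0
```

For the quadratic rates the partial sums approach $\pi^2/6$; for the bounded rates they equal `count / 2`.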

2° Conversely, given a regular $q$-pair, we construct a separable
stationary Markov r.f. $X_T$ with a tr.pr. from which the $q$-pair derives,
upon following the pattern set by what precedes.

We select $\tau_0 = 0$, $\xi_0 = X_0$, $\tau_1$, $\xi_1 = X_{\tau_1}$,
$\cdots$ as follows: the r.v. $\xi_0$ is chosen arbitrarily and for
$n \ge 0$, given the preceding choices, we choose $\tau_{n+1}$ and
$\xi_{n+1}$ so that

$$P(\tau_{n+1} \ge t \mid \tau_0, \xi_0, \cdots, \tau_n, \xi_n) = e^{-q(\xi_n)t},$$
$$P(\xi_{n+1} \in S \mid \tau_0, \xi_0, \cdots, \tau_n, \xi_n, \tau_{n+1}) = q(\xi_n, S)/q(\xi_n)$$

and, whenever $q(\xi_n(\omega)) = 0$, we take
$\tau_{n+1}(\omega) = \tau_{n+2}(\omega) = \cdots = \infty$,
$\xi_{n+1}(\omega) = \xi_{n+2}(\omega) = \cdots = \xi_n(\omega)$; we set
$X_t(\omega) = \xi_n(\omega)$ for
$t \in \big[\sum_{k=1}^{n} \tau_k(\omega), \sum_{k=1}^{n+1} \tau_k(\omega)\big)$
and $\tau_\Omega(\omega) = \sum_n \tau_n(\omega)$. Thus $X_t$ is defined for
all $t \in [0, \tau_\Omega)$.

If $P[\tau_\Omega = \infty] = 1$, then $X_T = (X_t, t \ge 0)$ is so defined.
If $P[\tau_\Omega = \infty] < 1$, then we continue the construction with
$\tau_\Omega$ as with $\tau_0$: we choose an arbitrary r.v. $\xi_\Omega$
independent of the $\xi_n$, $\tau_n$, choose $\tau_{\Omega+1}$ with
$P(\tau_{\Omega+1} \ge t \mid \xi_0, \tau_0, \cdots, \tau_\Omega, \xi_\Omega) = e^{-q(\xi_\Omega)t}$,
set $X_t(\omega) = \xi_\Omega(\omega)$ for
$t \in [\tau_\Omega(\omega), \tau_{\Omega+1}(\omega))$, and so on, starting
over, if necessary, at the new accumulation points of jump times with r.v.'s
with distribution $P_\Omega$ of $\xi_\Omega$. It is intuitive that this
defines $X_T = (X_t, t \ge 0)$, but we shall not prove it, for the proof
requires the use of ordinals.

$X_T$ is a stationary Markov r.f. and the $q$-pair derives from its tr.pr.,
as follows: Note that $P[\tau > t] = e^{-qt}$ implies that
$P(\tau > s + t \mid \tau > s) = e^{-qt}$ for any $s > 0$. This means that if
we stop the construction when we reach a $\tau > s$ and start it anew at $s$
in lieu of at 0 but use $X_s$ in lieu of $\xi_0$, then the $\tau$ and $\xi$
which follow have the same distribution as when the construction was not
stopped. Thus,
$P(X_{s+t} \in S \mid X_r, r \le s) = P(X_{s+t} \in S \mid X_s)$ and is
independent of $s$, and the above assertion is true.

If $P[\tau_\Omega = \infty] = 1$ then, up to the choice of $\xi_0 = X_0$, the
r.f. $X_T$ is the only one which conforms to what precedes 2°. Therefore,
the tr.pr. is unique and the $q$-pair derives from it. If
$P[\tau_\Omega = \infty] < 1$, then the constructed $X_T$ and its tr.pr.
depend upon the choice of $P_\Omega$.

The converse assertions are proved, and the proof is terminated.
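The construction in 2° translates directly into a simulation when the state space is finite. The sketch below is only an illustration under that assumption: `jump_rates[x][y]` stands for $q(x, \{y\})$ with $y \ne x$, so that $q(x)$ is the sum over $y$; the function name, the dictionary layout, and the use of a seeded pseudo-random generator are choices made for the example. Holding times are drawn with $P[\tau > t] = e^{-q(x)t}$ and the next state with law $q(x, \cdot)/q(x)$, and the path stops at an absorbing state or at a finite horizon.

```python
import random

def simulate_path(jump_rates, start, horizon, seed=0):
    """Return the jump times and states [(time, state), ...] up to `horizon`."""
    rng = random.Random(seed)
    time, state = 0.0, start
    path = [(time, state)]
    while True:
        total = sum(jump_rates[state].values())       # q(x)
        if total == 0.0:                              # absorbing: stays forever
            break
        time += rng.expovariate(total)                # exponential holding time
        if time >= horizon:
            break
        targets = list(jump_rates[state])
        weights = [jump_rates[state][y] for y in targets]
        state = rng.choices(targets, weights=weights)[0]   # law q(x, .)/q(x)
        path.append((time, state))
    return path
```

On a finite state space the rates are bounded, so the sample paths produced this way are step functions on the whole time axis and no accumulation of jumps has to be handled.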

§ 45. MARKOV SEMI-GROUPS 

45.1. Generalities. Markov semi-groups on $G$ characterize stationary
Markov laws of evolution. Their analysis requires introduction of
analytical concepts (limits, continuity, integration, differentiation) in
$G$.

We recall the notation to be used throughout. Unless otherwise stated,
times $r$, $s$, $t$, with or without affixes, are points of
$T = [0, \infty)$, states $x$, $y$, $z$, with or without affixes, are points
of a locally compact separable metric state space $\mathcal{X}$, sets $S$,
with or without affixes, are topological Borel sets (sets of the
$\sigma$-field $\mathcal{S}$ generated by the class of open sets in
$\mathcal{X}$), and $V_x(\varepsilon)$ are open spheres of radius
$\varepsilon$ centered at $x$.

To fix the ideas, we take the state space to be a Borel set in $R$ with the
usual topology in it. What follows extends at once to the general case.

The space $G$ is the Banach space of all bounded Borel f.'s $g$ on
$\mathcal{X}$ with the uniform norm $\|g\| = \sup_x |g(x)|$ for every
$g \in G$. The space $\Phi$ is the Banach space of all bounded signed
measures $\varphi$ on $\mathcal{S}$ with the variation norm
$\|\varphi\| = \mathrm{Var}\, \varphi = \varphi^+(\mathcal{X}) + \varphi^-(\mathcal{X})$;
in particular, all pr. measures $\delta_x$ which degenerate at $x$ belong to
$\Phi$. The elements $\varphi$ of $\Phi$ may be considered as linear
functionals on $G$:

$$\varphi(g) = (\varphi, g) = \int g\, d\varphi.$$

However, $\Phi$ is not the adjoint space of $G$; it is only a "reciprocal"
subspace of it, that is, such that
$\|g\| = \sup_{\|\varphi\| \le 1} (\varphi, g)$ for every $g \in G$ (follows
upon using the $\delta_x$). It ought to be noted that $\Phi$ is the adjoint
of the subspace $C_0\ (\subset G)$ of bounded continuous functions on
$\mathcal{X}$ vanishing at infinity.

We introduce two concepts of limit or types of convergence in $G$. Let
$t \to t_0$.

Strong convergence means uniform pointwise convergence: If
$g_t(x) \to g(x)$ uniformly in $x \in \mathcal{X}$, we say that $g_t$
converges strongly to $g$ or that $g$ is strong limit of $g_t$, and we write
$g_t \xrightarrow{s} g$. Thus, $g_t \xrightarrow{s} g$ is equivalent to
$\|g_t - g\| \to 0$, that is, to convergence in norm.

$\Phi$-weak convergence means bounded pointwise convergence: If
$g_t(x) \to g(x)$ for every $x \in \mathcal{X}$ and the $g_t$ are uniformly
bounded, we say that $g_t$ converges $\Phi$-weakly to $g$ or that $g$ is
$\Phi$-weak limit of $g_t$, and we write $g_t \xrightarrow{w} g$. Clearly,
strong convergence implies weak convergence, and it is easily seen that
$g_t \xrightarrow{w} g$ is equivalent to
$(\varphi, g_t) \to (\varphi, g)$ for every $\varphi \in \Phi$ (use the
Banach-Steinhaus uniform boundedness theorem). Thus, if we limit ourselves
to the subspace $C_0$ (so that $\Phi$ is its adjoint space), then
$\Phi$-weak convergence becomes the usual "weak" convergence in $C_0$. This
explains the "$\Phi$-weak" terminology.
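A standard example, not taken from the text, separates the two types of convergence: on the state space $[0, 1)$ the functions $g_n(x) = x^n$ are bounded by 1 and converge to 0 at every point, so they converge $\Phi$-weakly to 0, while $\|g_n\| = \sup_x x^n = 1$ for every $n$, so they do not converge strongly. The sketch below evaluates $g_n$ at a fixed point and estimates the uniform norm from below on grid points approaching 1; the grid is an assumption of the illustration.

```python
def g(n, x):
    """g_n(x) = x**n on the state space [0, 1)."""
    return x ** n

def sup_norm_estimate(n):
    """Lower estimate of sup |g_n| over [0, 1) from the grid points 1 - 2**-j."""
    return max(g(n, 1.0 - 2.0 ** -j) for j in range(1, 51))
```

For large `n` the value `g(n, x)` at any fixed `x` is negligible, while `sup_norm_estimate(n)` stays close to 1.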

To each of the foregoing concepts of limit correspond concepts of
continuity, differentiation, and integration. Let $g_t$ be a function in
$t \in [a, b] \subset R$ with values in $G$. Let $t' \to t$ and
$0 < h \to 0$. The function $g_t$ is strongly continuous at $t$ if
$g_{t'} \xrightarrow{s} g_t$, and it is strongly differentiable at $t$ if
$(g_{t'} - g_t)/(t' - t)$ converges strongly, necessarily to an element of
$G$, to be called strong derivative of $g_t$ at $t$ and to be denoted by
$Dg_t$. If $t' = t + h$ (or $t' = t - h$), then the derivative is from the
right (or left) and denoted by $D^+ g_t$ (or $D^- g_t$); the derivative at
$a$ (or $b$) is necessarily from the right (or left). We drop "at $t$" when
the foregoing properties hold for every $t$. The same definitions apply upon
replacing "strong" by "$\Phi$-weak," "$s$" by "$w$," and the strong
derivative symbol by the corresponding $\Phi$-weak one. Clearly, strong
($\Phi$-weak) differentiability implies strong ($\Phi$-weak) continuity. In
fact

a. $\Phi$-weak differentiability implies strong continuity.

For, if $(g_{t'} - g_t)/(t' - t)$ converges weakly hence boundedly, then

$$\|g_{t'} - g_t\| \le c\,|t' - t| \to 0.$$

The function $g_t$ is strongly integrable on a bounded interval $[c, d)$ if its Riemann sums converge strongly in the usual way. The limit is then necessarily an element of $G$, to be called the strong integral of $g_t$ on $[c, d)$ and denoted by $\int_c^d g_t\,dt$.

Strong integrals on unbounded intervals are defined by strong passages to the limit exactly as for the improper Riemann integrals. The usual properties of Riemann integrals remain valid: change of variables, additivity, integrability of strongly continuous functions on bounded intervals and also on unbounded intervals when these functions are bounded in norm by numerical functions integrable on these intervals. Similarly, the inequality $\left\|\int g_t\,dt\right\| \le \int \|g_t\|\,dt$ remains valid, and so does the convergence property: as $h \to 0$,

$$\frac{1}{h}\int_c^{c+h} g_t\,dt \xrightarrow{s} g_c \quad \text{when } g_t \xrightarrow{s} g_c \text{ as } t \to c + 0.$$



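The $\Phi$-weak integral $g_{[c,d)} \in G$ is defined by $g_{[c,d)}(x) = \int_c^d g_t(x)\,dt$ for functions $g_t$ measurable in $(x, t)$ and bounded in norm by numerical Lebesgue integrable functions. The convergence property holds: as $h \to 0$, $\frac{1}{h}\,g_{[c,c+h)} \xrightarrow{w} g_c$ when $g_t \xrightarrow{w} g_c$ as $t \to c + 0$, and the Fubini theorem with finite measures $\varphi$ on $\mathfrak{X}$ applies:

$$(\varphi, g_{[c,d)}) = \int \varphi(dx)\int_c^d g_t(x)\,dt = \int_c^d (\varphi, g_t)\,dt.$$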



Let $T$, with or without affixes, denote an endomorphism on $G$, that is, a linear bounded mapping on $G$ to $G$:

$$T(ag + a'g') = aTg + a'Tg', \quad \|Tg\| \le c\,\|g\|, \quad c < \infty.$$

The smallest possible value of $c$, as $g$ varies, is $\|T\| = \sup_{g \ne 0} \|Tg\|/\|g\|$, or the norm of $T$. If $g \ge 0 \Rightarrow Tg \ge 0$ we say that $T$ is nonnegative, and if $\|T\| \le 1$ we say that $T$ is a contraction. Multiplication by scalars, addition, and multiplication of endomorphisms, defined by

$$(aT)g = a(Tg), \quad (T + T')g = Tg + T'g, \quad (TT')g = T(T'g),$$

yield endomorphisms. It follows that the space $\mathcal{E}$ of our endomorphisms is linear and with the foregoing norm becomes a Banach space. Furthermore, multiplication of endomorphisms commutes with their multiplication by scalars, is distributive with respect to addition, and $TI = IT = T$ where $I$ is the identity mapping. Thus $\mathcal{E}$ is an "algebra with unit $I$" and, since $\|TT'\| \le \|T\| \cdot \|T'\|$, it is a "Banach algebra":

b. The Banach space of endomorphisms on a Banach space is a Banach algebra.

Let $t \to t_0$. In $\mathcal{E}$ on our space $G$ we have at our disposal the usual convergence in norm $\|T_t - T\| \to 0$, or uniform convergence $T_t \xrightarrow{u} T$, and the types of convergence induced by those in $G$: strong convergence $T_t \xrightarrow{s} T$, meaning $T_t g \xrightarrow{s} Tg$ for every $g \in G$, and $\Phi$-weak convergence $T_t \xrightarrow{w} T$, meaning $T_t g \xrightarrow{w} Tg$ for every $g \in G$. To each of these types of convergence correspond types of limits, hence of continuity and of integrals. For example: if $T_t$ is uniformly continuous in $t \in [c, d]$, then the uniform integral $\int_c^d T_t\,dt$, defined as uniform limit of corresponding Riemann sums, exists and is an endomorphism. The strong and $\Phi$-weak integrals are induced by those for $g \in G$: the strong integral is defined by

$$\left(\int_c^d T_t\,dt\right) g = \int_c^d T_t g\,dt \quad \text{for all } g \in G,$$

and the $\Phi$-weak integral is defined by

$$\left(\int_{[c,d)} T_t\,dt\right) g = \int_{[c,d)} T_t g\,dt \quad \text{for all } g \in G.$$

If an endomorphism $T$ on $G$ and an endomorphism $U$ on $\Phi$ are such that, for all $g \in G$, $\varphi \in \Phi$,

$$(\varphi, Tg) = \int \varphi(dx)\,Tg(x) = \int U\varphi(dx)\,g(x) = (U\varphi, g),$$

then we shall set $U\varphi = \varphi T$, write the above relation $(\varphi, Tg) = (\varphi T, g)$, and say that $T$ is $\Phi$-adjoint: its adjoint on the adjoint space of $G$ leaves $\Phi$ invariant. Clearly

c. Endomorphisms $T$ and strong passages to the limit commute.

But generally this is not true of $\Phi$-weak passages to the limit. However

c'. $\Phi$-adjoint endomorphisms $T$ commute with $\Phi$-weak passages to the limit.

For, $g_t \xrightarrow{w} g \Rightarrow (\varphi, Tg_t) = (\varphi T, g_t) \to (\varphi T, g) = (\varphi, Tg) \Rightarrow Tg_t \xrightarrow{w} Tg$.

We are now ready for the introduction of Markov endomorphisms. Let $P(x, S)$ denote Borel functions in $x \in \mathfrak{X}$ and pr.'s in $S \in \mathcal{S}$ or, more generally, measures in $S$ bounded by 1. To every $P(x, S)$ there correspond an endomorphism $T$ on $G$ and an endomorphism $U$ on $\Phi$, to be called Markov endomorphisms, defined by

$$Tg(x) = \int P(x, dy)\,g(y), \quad U\varphi(S) = \int \varphi(dx)\,P(x, S).$$

Clearly, $T$ and $U$ are nonnegative contractions, and when $P(x, S)$ is a pr. in $S$, so that $P(x, \mathfrak{X}) = 1$ for all $x$, then $\|T\| = \|U\| = 1$. Either $T$ or $U$ determines $P(x, S)$; it suffices to take $g(\cdot) = I(\cdot, S)$ or $\varphi(\cdot) = I(x, \cdot)$. In fact, $T$ is $\Phi$-adjoint (to $U$) since

$$(\varphi, Tg) = \iint \varphi(dx)\,P(x, dy)\,g(y) = (U\varphi, g).$$

We shall concentrate on Markov endomorphisms $T$ on $G$ and have

A. Markov endomorphisms criterion. An endomorphism $T$ on $G$ is Markovian if and only if it is a nonnegative $\Phi$-adjoint contraction.

Proof. The "only if" assertion is contained in what precedes. As for the "if" assertion, let $T$ be a nonnegative contraction $\Phi$-adjoint (to $U$). Set $P(x, S) = \Lambda_x(S) = I(x, S)T$ (where the last term stands for $U\varphi(S)$ with $\varphi(\cdot) = I(x, \cdot)$). Since every $\varphi T$ is a measure so is $\Lambda_x$ and, from $(\varphi, Tg) = (\varphi T, g)$, it follows that

$$Tg(x) = \int I(x, dy)\,Tg(y) = \int P(x, dy)\,g(y).$$

In particular, $TI(x, S) = P(x, S) \in G$ and, since our $T$ is a nonnegative contraction and $I(x, S)$ is bounded by 1, it follows that $P(x, S)$ is a nonnegative Borel function in $x$ bounded by 1. The proof is terminated.
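On a finite state space the objects above reduce to matrix algebra, which gives a quick way to check the criterion by hand. The following is only a sketch under that assumption: $P(x, S)$ becomes a stochastic matrix `P`, functions $g$ and measures $\varphi$ become lists of numbers, and the helper names (`apply_T`, `apply_U`, `pairing`, `sup_norm`) are hypothetical.

```python
def apply_T(P, g):
    """Tg(x) = sum_y P(x, y) g(y): the endomorphism on functions."""
    return [sum(P[x][y] * g[y] for y in range(len(g))) for x in range(len(P))]


def apply_U(phi, P):
    """(U phi)(y) = sum_x phi(x) P(x, y): the endomorphism on measures."""
    return [sum(phi[x] * P[x][y] for x in range(len(phi))) for y in range(len(P[0]))]


def pairing(phi, g):
    """(phi, g) = sum_x phi(x) g(x), the integral of g with respect to phi."""
    return sum(p * v for p, v in zip(phi, g))


def sup_norm(g):
    """Norm of a bounded function on the finite state space."""
    return max(abs(v) for v in g)


# A three-state transition matrix: nonnegative rows summing to 1.
P = [[0.50, 0.50, 0.00],
     [0.25, 0.50, 0.25],
     [0.00, 0.50, 0.50]]
g = [1.0, -2.0, 3.0]        # a bounded function on the states
phi = [0.2, -0.1, 0.4]      # a bounded signed measure on the states

Tg = apply_T(P, g)
print("Tg =", Tg)
print("contraction:", sup_norm(Tg) <= sup_norm(g))
print("adjointness gap:", abs(pairing(phi, Tg) - pairing(apply_U(phi, P), g)))
```

The three printed checks correspond to the three properties in the criterion: $T$ maps nonnegative functions to nonnegative functions, does not increase the sup norm, and satisfies $(\varphi, Tg) = (U\varphi, g)$.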

Let $P_t(x, S)$ be a stationary tr.pr. except that in lieu of $P_t(x, \mathfrak{X}) = 1$ we assume only that $P_t(x, \mathfrak{X}) \le 1$, unless otherwise stated. In terms of Markov r.f.'s this assumption may mean that its r.v.'s when numerical may take infinite values with positive pr.'s. According to what precedes, $P_t(x, S)$ as a Borel function in $x$ and a measure in $S$ bounded by 1 determines and is determined by a Markov endomorphism $T_t$ on $G$ defined by

$$T_t g(x) = \int P_t(x, dy)\,g(y), \quad g \in G.$$

There remains the stationary Kolmogorov equation, which links the values of the tr.pr. for different values of $t$ and which, by

$$T_{s+t}\,g(x) = \int P_{s+t}(x, dz)\,g(z) = \iint P_s(x, dy)\,P_t(y, dz)\,g(z) = T_s T_t\,g(x),$$

is equivalent to the semi-group property $T_{s+t} = T_s T_t$.
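The semi-group property can be verified numerically on the simplest nontrivial case. The sketch below assumes a two-state process that leaves state 0 at rate `a` and state 1 at rate `b` (these rates and all function names are illustrative choices, not notation from the text); its transition matrix has a well-known closed form, so $P_{s+t} = P_s P_t$ is a direct matrix identity.

```python
import math


def two_state_P(t, a, b):
    """Closed-form transition matrix P_t of the two-state chain.

    State 0 jumps to state 1 at rate a, state 1 jumps to state 0 at rate b.
    """
    s = a + b
    e = math.exp(-s * t)
    return [[(b + a * e) / s, (a - a * e) / s],
            [(b - b * e) / s, (a + b * e) / s]]


def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def max_abs_diff(A, B):
    """Largest entrywise difference between two matrices."""
    return max(abs(A[i][j] - B[i][j])
               for i in range(len(A)) for j in range(len(A[0])))


a, b = 2.0, 1.0
s, t = 0.3, 0.7
lhs = two_state_P(s + t, a, b)                              # P_{s+t}
rhs = mat_mul(two_state_P(s, a, b), two_state_P(t, a, b))   # P_s P_t
print("Kolmogorov equation gap:", max_abs_diff(lhs, rhs))
```

The gap is at the level of floating-point rounding, which is the matrix form of $T_{s+t} = T_s T_t$.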

We say that this family of Markov endomorphisms, which is in a one-to-one correspondence with a stationary tr.pr., hence with the corresponding law of evolution of a stationary Markov process, is a Markov semi-group. Unless otherwise stated, semi-groups are semi-groups of endomorphisms on $G$ and all endomorphisms are on $G$. If a semi-group consists of $\Phi$-adjoint (nonnegative) (contraction) endomorphisms we say that it is a $\Phi$-adjoint (nonnegative) (contraction) semi-group. If, moreover, all functions $T_t g(x)$ are Borelian in $(x, t)$ we say that the semi-group is Borelian.

B. Markov semi-groups criterion. A semi-group is a Markov semi-group if and only if it is a nonnegative $\Phi$-adjoint contraction semi-group.

A Markov semi-group is Borelian if and only if the corresponding stationary tr.pr. is Borelian.

Proof. The first assertion follows from A. The second assertion follows from the integral form of $T_t g(x)$: the "only if" part upon taking $g(\cdot) = I(\cdot, S)$ and the "if" part upon approximating $g$ by simple functions.

45.2. Analysis of semi-groups. While our concern is with Markov semi-groups, the concepts and general properties below are valid for more general contraction semi-groups of endomorphisms on Banach spaces $G$, and they are stated accordingly. As usual, the limits in $h$ are taken as $0 < h \to 0$, and if a property holds at all $t$, we drop "at $t$."

Let $(T_t, t \ge 0)$ be a contraction semi-group:

$$T_{s+t} = T_s T_t, \quad T_0 = I, \quad \|T_t\| \le 1, \quad s, t \ge 0.$$

A set $G_0$ in $G$ is invariant (by the semi-group) if all $T_t G_0 \subset G_0$; in other words, there is a restriction of the semi-group on $G_0$ to $G_0$. The whole space $G$ and the singleton which consists of the origin of $G$ are trivially invariant.

We denote by $G_c$ the set on which our semi-group is strongly continuous at $t = 0$: $g \in G_c \Leftrightarrow T_h g \xrightarrow{s} g$.

a. Strong continuity lemma. $G_c$ is the set on which the contraction semi-group is strongly continuous, and it is an invariant Banach subspace.

Proof. Clearly $G_c$ contains the strong continuity set of the semi-group. But $G_c$ is also contained in this set, and is invariant. For, $(T_h - I)T_t = T_t(T_h - I)$ and, for every $g \in G_c$,

$$\|T_{t+h}\,g - T_t g\| = \|T_t(T_h - I)g\| \le \|(T_h - I)g\| \to 0,$$

$$\|T_t g - T_{t-h}\,g\| = \|T_{t-h}(T_h - I)g\| \le \|(T_h - I)g\| \to 0.$$

$G_c$ is obviously closed under linear combinations. It is also closed under strong passages to the limit by sequences $g_n \xrightarrow{s} g$, $g_n \in G_c$, hence is a Banach space. For, as $h \to 0$ then $n \to \infty$,

$$\|T_h g - g\| \le \|T_h(g - g_n)\| + \|(T_h - I)g_n\| + \|g_n - g\| \le 2\,\|g - g_n\| + \|(T_h - I)g_n\| \to 0.$$

The proof is terminated.

We denote by $G_d$ the set on which the semi-group is strongly differentiable at $t = 0$: $g \in G_d \Leftrightarrow D_h g \xrightarrow{s} Dg$, where $D_h = (T_h - I)/h$. Clearly $G_d \subset G_c$. The (strong) differentiation operator $D$ on $G_d$ to $G$ is also called the (strong) infinitesimal operator or the generator of the semi-group. For, it generates the semi-group at least on $G_c$, as will be seen later. Clearly, $D$ is a linear operator on the obviously linear space $G_d$. But in general, $D$ is not bounded and $G_d$ is not a Banach subspace. Note that if $g \in G_c$ and $a$, $b$ are finite then the strong integral $g_a^b = \int_a^b T_t g\,dt$ exists. For, the function $T_t g$ is then strongly continuous in $t$. Furthermore $g_a^b \in G_d$, since

$$(T_h - I)\,g_a^b = \int_a^b T_{t+h}\,g\,dt - \int_a^b T_t g\,dt = \int_b^{b+h} T_t g\,dt - \int_a^{a+h} T_t g\,dt$$

implies that

$$D_h\,g_a^b \xrightarrow{s} (T_b - T_a)\,g = D g_a^b;$$

in particular

$$G_d \ni \frac{1}{h}\int_0^h T_t g\,dt \xrightarrow{s} g, \quad g \in G_c.$$

b. Strong differentiation lemma. $G_d$ is the set on which the contraction semi-group is strongly differentiable, and it is an invariant set dense in $G_c$ with $DT_t = T_t D$ on it.

Proof. Clearly, $G_d$ contains the strong differentiability set of the semi-group. But $G_d$ is also contained in this set and is invariant with $DT_t = T_t D$ on it. For, if $g \in G_d$, then

$$D_h(T_t g) = (T_{t+h} - T_t)\,g/h = T_t(D_h g) \xrightarrow{s} T_t(Dg) = D(T_t g),$$

hence $T_t g \in G_d$ and $D(T_t g) = T_t(Dg)$, and

$$(T_t - T_{t-h})\,g/h = T_{t-h}(D_h g) = T_{t-h}(D_h g - Dg) + T_{t-h}\,Dg \xrightarrow{s} T_t(Dg).$$

Finally, $G_d$ is dense in $G_c$ since every $g \in G_c$ is strong limit of elements $\frac{1}{h}\int_0^h T_t g\,dt$ of $G_d$. The proof is terminated.

We replace now strong limits by $\Phi$-weak limits. If we try to proceed as for the strong continuity, we find that left limits have to be left out and we need the functional form $(\varphi, g_n) \to (\varphi, g)$ of $g_n \xrightarrow{w} g$; hence, to use 40.1c', we assume that the semi-group is $\Phi$-adjoint. If we try to proceed as for the strong differentiability, we find that in order to introduce $\Phi$-weak integrals to establish the closure property, we have to assume that the semi-group is Borelian. We are then led without any difficulty to the following definitions and propositions. Denote by $G'_c$ the set on which the semi-group is $\Phi$-weakly continuous at $t = 0$: $g \in G'_c \Leftrightarrow T_h g \xrightarrow{w} g$.

a'. $\Phi$-weak continuity lemma. $G'_c$ is the set on which the $\Phi$-adjoint contraction semi-group is $\Phi$-weakly right continuous, and it is an invariant Banach subspace.

Denote by $G'_d$ the set on which the semi-group is $\Phi$-weakly differentiable at $t = 0$: $g \in G'_d \Leftrightarrow D_h g \xrightarrow{w} D'g$. The $\Phi$-weak differentiation operator $D'$ is also called the $\Phi$-weak infinitesimal operator or the $\Phi$-weak generator of the semi-group. Then

b'. $\Phi$-weak differentiation lemma. $G'_d$ is the set on which the $\Phi$-adjoint contraction semi-group is $\Phi$-weakly right differentiable, and it is an invariant set with $D'T_t = T_t D'$ on it.

If, moreover, the semi-group is Borelian, then the $\Phi$-weak closure of $G'_d$ contains $G'_c$.

The four spaces so distinguished are related by

c. Inclusion and closure lemma. The invariant continuity and differentiation sets of a $\Phi$-adjoint contraction semi-group are ordered by the inclusions $G_d \subset G'_d \subset G_c \subset G'_c$, and $G_c$ is the strong closure of $G_d$.

If, moreover, the semi-group is Borelian, then these four sets have common $\Phi$-weak closure $\bar{G}'_c$.

Proof. The first part follows from the preceding lemmas except for $G'_d \subset G_c$, which follows from 40.1a. The second part follows from the asserted ordering and the last assertion in b'.

Instead of replacing "strong" by "$\Phi$-weak" we may replace it by "uniform." Then much more is true, as follows. Let $E$ be an endomorphism on a Banach space and let $E^n$ be its $n$-th iterate ($E^0 = I$). Clearly, the exponentials $e^{tE}$ defined by $e^{tE} = \sum_{n=0}^{\infty} \frac{t^n}{n!} E^n$ are endomorphisms bounded by $e^{t\|E\|}$ and form a semi-group, the exponential semi-group generated by $E$. This semi-group is uniformly continuous and uniformly differentiable with "uniform differential operator" $E$, since

$$e^{(t+h)E} - e^{tE} = e^{tE}(e^{hE} - I), \quad e^{tE} - e^{(t-h)E} = e^{(t-h)E}(e^{hE} - I)$$

and

$$\left\| \frac{e^{hE} - I}{h} - E \right\| \le \left(e^{h\|E\|} - 1 - h\|E\|\right)/h \to 0.$$
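For a matrix $E$ the exponential series can be summed directly, which makes the two claims above easy to test. The sketch below is only illustrative: it truncates the series after a fixed number of terms (an assumption adequate for small $t\|E\|$), uses a 2-by-2 matrix of our own choosing, and all helper names are hypothetical.

```python
def identity(n):
    """n-by-n identity matrix as a list of rows."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def mat_mul(A, B):
    """Product of two square matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(E, t, terms=60):
    """Partial sum of the series sum_n (t^n / n!) E^n."""
    n = len(E)
    result = identity(n)
    term = identity(n)  # holds (t^k / k!) E^k at step k
    for k in range(1, terms):
        step = [[t * E[i][j] / k for j in range(n)] for i in range(n)]
        term = mat_mul(term, step)
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


def max_abs_diff(A, B):
    """Largest entrywise difference between two matrices."""
    return max(abs(A[i][j] - B[i][j])
               for i in range(len(A)) for j in range(len(A[0])))


E = [[-2.0, 2.0], [1.0, -1.0]]
s, t, h = 0.3, 0.7, 1e-4

# Semi-group property: e^{(s+t)E} = e^{sE} e^{tE}.
semigroup_gap = max_abs_diff(mat_exp(E, s + t),
                             mat_mul(mat_exp(E, s), mat_exp(E, t)))

# Uniform differentiability at 0: (e^{hE} - I)/h is close to E for small h.
exp_h = mat_exp(E, h)
quotient = [[(exp_h[i][j] - identity(2)[i][j]) / h for j in range(2)]
            for i in range(2)]
derivative_gap = max_abs_diff(quotient, E)

print(semigroup_gap, derivative_gap)
```

The first number is at rounding level, and the second is of order $h$, in line with the bound $(e^{h\|E\|} - 1 - h\|E\|)/h$.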

d. Uniform continuity and differentiation lemma. If the contraction semi-group is uniformly continuous at $t = 0$ on an invariant Banach subspace $G_u$, then the semi-group is exponential on it.

Proof. $G_u$ is invariant, and the semi-group is uniformly continuous on it since

$$\|T_{t+h} - T_t\| \le \|T_h - I\| \to 0 \quad \text{and} \quad \|T_t - T_{t-h}\| \le \|T_h - I\| \to 0$$

on $G_u$. From here on we consider the semi-group on the invariant Banach space $G_u$ only. Clearly, the uniform integral $I_h = \int_0^h T_s\,ds$ exists and $I_h/h \xrightarrow{u} I$. Therefore, for $h = h_0$ sufficiently small, $I_{h_0}$ has an inverse $I_{h_0}^{-1}$. Let $E = (T_{h_0} - I)\,I_{h_0}^{-1}$. It follows from

$$(T_t - I)\int_0^{h_0} T_s\,ds = \int_t^{t+h_0} T_s\,ds - \int_0^{h_0} T_s\,ds = (T_{h_0} - I)\int_0^t T_s\,ds$$

that

$$T_t - I = E\int_0^t T_s\,ds,$$

hence $D_h = E\,I_h/h \xrightarrow{u} E$. Also, proceeding by induction,

$$T_t - I = \sum_{k=1}^{n} \frac{t^k}{k!}\,E^k + E^{n+1}\int_0^t \frac{(t - s)^n}{n!}\,T_s\,ds,$$

where the norm of the right side summation is bounded by $e^{t\|E\|}$ and that of the remaining term is bounded by $t^{n+1}\|E\|^{n+1}/(n + 1)! \to 0$.

Thus $T_t = \sum_{n=0}^{\infty} \frac{t^n}{n!}\,E^n = e^{tE}$, and the proof is terminated.

Local characterization. The purely analytical problem relative to Markov processes (regular and stationary) corresponding to stationary tr.pr.'s, hence to Markov semi-groups, can now be stated as follows: Characterize Markov semi-groups in terms of their local properties. It is a specialization of the same problem for general contraction semi-groups to nonnegative $\Phi$-adjoint endomorphisms on our function space $G$. Thus, we first treat the general case (then the results extend at once to general Banach spaces $G$), and specialize it later. The problem may be decomposed as follows: The existence problem is that of characterizing infinitesimal operators of contraction semi-groups, the unicity problem is that of characterizing those infinitesimal operators which determine their semi-group, and the generation problem is that of constructing the corresponding semi-groups. However, at best we may hope for answers on the strong or $\Phi$-weak closures of the domains of the infinitesimal operator. Thus appears the extension problem: find conditions under which extensions of semi-groups exist and are unique on domains containing the space $G$ with which we are concerned.

In the numerical case, where the contractions are endomorphisms on $R$, the semi-group property reduces to the classical functional equation $f(s + t) = f(s)f(t)$ with $f(0) = 1$, $|f(t)| \le 1$. The only continuous and, in fact, the only measurable solutions are exponential: $f(t) = e^{td}$ where $d \le 0$. In the general case, the formal solution of the equation $\frac{d}{dt}T_t = DT_t$ with $T_0 = I$ is similarly exponential, $T_t = e^{tD}$, which would require that the infinitesimal operator $D$ be an endomorphism. Yet, an infinitesimal operator $D$, while not necessarily an endomorphism, is always the limit of endomorphisms $D_h = (T_h - I)/h$. This fact, as well as the numerical case and the formal approach, lead us to expect that, at least on the strong closure $G_c$ of the domain $G_d$ of $D$, the corresponding semi-group would be determined by $D$ and could be represented as limit of exponential semi-groups. We shall show that these expectations are justified. First, we have to introduce the necessary tool: the Laplace transform or "resolvent" of a contraction semi-group.

The (strong) resolvent $R_\lambda$, on the (strong) continuity subspace $G_c$ of the contraction semi-group $(T_t, t \ge 0)$, is defined by the strong integrals

$$R_\lambda g = \int_0^\infty e^{-\lambda t}\,T_t g\,dt, \quad g \in G_c, \quad \lambda \in (0, \infty).$$

$R_\lambda g$ exists and belongs to the Banach subspace $G_c$, since the integrand is strongly continuous and is bounded in norm by $e^{-\lambda t}\|g\|$, whose integral on $[0, \infty)$ is $\|g\|/\lambda$. Clearly $R_\lambda$ is linear. Thus $R_\lambda$ is an endomorphism on $G_c$ with $\|\lambda R_\lambda\| \le 1$. The basic role of the resolvent is due to

e. Resolvent lemma. The resolvent $R_\lambda$ of the contraction semi-group with infinitesimal operator $D$ on $G_d$ is the inverse of the one-to-one mapping $\lambda I - D$ of $G_d$ onto $G_c$.

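Proof. If $g \in G_c$ then, from

$$T_h R_\lambda g = \int_0^\infty e^{-\lambda t}\,T_{t+h}\,g\,dt = e^{\lambda h}\int_h^\infty e^{-\lambda t}\,T_t g\,dt = e^{\lambda h}\left(R_\lambda g - \int_0^h e^{-\lambda t}\,T_t g\,dt\right),$$

it follows that

$$D_h R_\lambda g = \frac{e^{\lambda h} - 1}{h}\,R_\lambda g - \frac{e^{\lambda h}}{h}\int_0^h e^{-\lambda t}\,T_t g\,dt \xrightarrow{s} (\lambda R_\lambda - I)\,g,$$

hence $R_\lambda g \in G_d$. If $g \in G_d\ (\subset G_c)$ then, moreover,

$$D_h R_\lambda g = R_\lambda D_h g \xrightarrow{s} R_\lambda Dg.$$

Therefore,

$$DR_\lambda = \lambda R_\lambda - I \text{ on } G_c, \quad R_\lambda D = \lambda R_\lambda - I \text{ on } G_d,$$

equivalently,

$$(\lambda I - D)R_\lambda = I \text{ on } G_c, \quad R_\lambda(\lambda I - D) = I \text{ on } G_d.$$

The proof is terminated.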
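The resolvent lemma can be checked by hand in a finite-dimensional setting. The sketch below assumes a two-state Markov semi-group with jump rates `a` and `b` (an illustrative choice, with hypothetical function names): its generator is the matrix $D = \begin{pmatrix} -a & a \\ b & -b \end{pmatrix}$, and the Laplace transform of its closed-form transition matrix can be written down entry by entry, so $(\lambda I - D)R_\lambda = I$ becomes a direct computation.

```python
def resolvent(lam, a, b):
    """Entrywise Laplace transform of the two-state transition matrix P_t.

    Uses the integrals of exp(-lam*t) and exp(-(lam+a+b)*t) over [0, inf),
    which equal 1/lam and 1/(lam+a+b).
    """
    s = a + b
    return [[(b / lam + a / (lam + s)) / s, (a / lam - a / (lam + s)) / s],
            [(b / lam - b / (lam + s)) / s, (a / lam + b / (lam + s)) / s]]


def lam_I_minus_D(lam, a, b):
    """The matrix lam*I - D for the generator D = [[-a, a], [b, -b]]."""
    return [[lam + a, -a],
            [-b, lam + b]]


def mat_mul(A, B):
    """Product of two square matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


lam, a, b = 1.5, 2.0, 1.0
R = resolvent(lam, a, b)
print("(lam I - D) R =", mat_mul(lam_I_minus_D(lam, a, b), R))
print("R (lam I - D) =", mat_mul(R, lam_I_minus_D(lam, a, b)))
# lam * R is again a transition matrix: nonnegative with unit row sums.
print("rows of lam R:", [[lam * v for v in row] for row in R])
```

Both products print the identity matrix up to rounding, and the rows of $\lambda R_\lambda$ are probability vectors, consistent with $\|\lambda R_\lambda\| \le 1$.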

The theorem which follows will lead us to the "natural" appearance of spaces of continuous functions.

A. Unicity theorem. Let $D$ with its domain $G_d$ be the infinitesimal operator of a contraction semi-group.

(i) $D$ determines the semi-group on its strong continuity space $G_c$ and determines the semi-group on the $\Phi$-weak closure $\bar{G}'_c$ of $G_c$ when the contractions are $\Phi$-adjoint.

(ii) The unique solution of the equation $\frac{df_t}{dt} = Df_t$, $f_t \in G_d$, which is bounded, reduces to $g \in G_d$ for $t = 0$ and has a strongly continuous strong derivative, is $f_t = T_t g$, $g \in G_d$.

Proof. 1° Given $D$ on the strong differentiability set $G_d$, the strong continuity space $G_c$ is determined as strong closure of $G_d$ and then, by e, $R_\lambda$ on $G_c$ is determined. Therefore, by the classical unicity theorem for numerical Laplace transforms $R_\lambda g(x) = \int_0^\infty e^{-\lambda t}\,T_t g(x)\,dt$ of continuous functions $T_t g(x)$ in $t$, these functions are determined. The $G_c$-assertion is proved, and the assertion for the $\Phi$-weak closure follows since $\Phi$-adjoint endomorphisms commute with $\Phi$-weak passages to the limit.

2° According to b, $f_t = T_t g$, $g \in G_d$, is a solution of the stated equation with the asserted properties. The "unicity" assertion will follow if we show that such a solution, which vanishes for $t = 0$ (in lieu of reducing to $g$), vanishes for all $t$. Let $g_t = e^{-\lambda t} f_t$ and note that, according to the hypotheses made, $g_0 = 0$, $\frac{dg_t}{dt} \in G_c$ and $g_s \to 0$ as $s \to \infty$. On account of the stated equation, we have $\frac{dg_t}{dt} = (D - \lambda I)\,g_t$, hence, by e, $g_t = -R_\lambda \frac{dg_t}{dt}$. It follows that for all $\lambda$ ($> 0$), as $s \to \infty$,

$$\int_0^s e^{-\lambda t} f_t\,dt = \int_0^s g_t\,dt = -R_\lambda \int_0^s \frac{dg_t}{dt}\,dt = -R_\lambda g_s \to 0.$$

Therefore, by the unicity theorem recalled above, $f_t = 0$. The proof is terminated.

Upon replacing "strong" by "$\Phi$-weak," parallel definitions and slightly more involved but similar arguments yield without difficulty

e'. $\Phi$-weak resolvent lemma. The $\Phi$-weak resolvent $R_\lambda$ of a $\Phi$-adjoint contraction Borel semi-group is the inverse of the one-to-one mapping $\lambda I - D'$ of $G'_d$ onto $G'_c$.

A'. $\Phi$-weak unicity theorem. Let $D'$ with its domain $G'_d$ be the $\Phi$-weak infinitesimal operator of a $\Phi$-adjoint contraction Borel semi-group.

(i) $D'$ determines the semi-group on the common $\Phi$-weak closure $\bar{G}'_c$ of its continuity and differentiation spaces.

(ii) The unique solution of the equation $\frac{d^+ f_t}{dt} = D'f_t$, $f_t \in G'_d$, which is bounded and reduces to $g \in G'_d$ for $t = 0$, is measurable in $(x, t)$ and continuous in $t$, and has a right continuous right derivative bounded on every finite interval of values of $t$, is $f_t = T_t g$, $g \in G'_d$.

The foregoing unicity theorems lead at once to the following question: when is the closure of a subset of $G$ the whole of $G$? For then, the semi-group is completely determined on $G$ and the extension problem is solved in the affirmative. An answer is immediately available if we recall the Baire definition of Borel functions: The class of Borel functions on a Euclidean space $\mathfrak{X}$ is the closure of its subclass of continuous functions under pointwise passages to the limit of sequences. The space $C$ of bounded continuous functions on $\mathfrak{X}$ is a Banach subspace of the Banach space $G$ of bounded Borel functions on $\mathfrak{X}$. Thus, $G$ is the $\Phi$-weak closure of its Banach subspace $C$.

B. C-extension theorem. If the $\Phi$-weak closure of the strong ($\Phi$-weak) continuity subspace of a $\Phi$-adjoint (Borel) contraction semi-group contains the subspace $C$, then the semi-group is completely determined by its strong ($\Phi$-weak) infinitesimal operator. In particular, the semi-group is completely determined when $C$ is invariant and the semi-group is strongly ($\Phi$-weakly) continuous on $C$.

Moreover, if the contraction Borel semi-group leaves $C_0$ invariant and is $\Phi$-weakly continuous on $C_0$, then it is $\Phi$-adjoint, strongly continuous on $C_0$, and, in fact, "weak" and "strong" concepts coincide in $C_0$.

Proof. The determination assertion results from the unicity theorems and the $\Phi$-weak closure property of $C$. The infinitesimal operators assertion results from the continuity assertion since then, by the resolvent lemma, $R_\lambda$ is a one-to-one mapping of $C \cap G_c$ onto $C \cap G_d$ and of $C \cap G'_c$ onto $C \cap G'_d$.

Since $\Phi$ is the adjoint of $C_0$, the "$\Phi$-adjoint" assertion is immediate. The "weak"-"strong" assertion follows from 40.3d. The proof is terminated.

The stated problems are answered in terms of (strong) infinitesimal operators as follows.

Let $(T_{\lambda,t}, t \ge 0)$ be a family in $\lambda > 0$ of strongly continuous contraction semi-groups on a common invariant Banach space $H$. Take limits as $\lambda, \mu \to \infty$, unless otherwise stated. We say that these semi-groups converge strongly if on $H$ there exist endomorphisms $T_t$, necessarily contractions, such that $T_{\lambda,t} \xrightarrow{s} T_t$ uniformly in $t \le a$ for any finite $a$: $\|T_{\lambda,t}\,g - T_t g\| \to 0$ uniformly in $t \le a$, for every $g \in H$. It is easily seen that such convergence implies and is implied by the corresponding mutual convergence $T_{\lambda,t} - T_{\mu,t} \xrightarrow{s} 0$ uniformly in $t \le a$, and that if either convergence holds on a set dense in $H$ then it holds on $H$. Furthermore, the limits $T_t$ form a contraction semi-group. Finally, because of the uniformity condition, it is strongly continuous on $H$ and, for $\nu > 0$, $g \in H$, as $\lambda \to \infty$,

$$R_{\nu,\lambda}\,g = \int_0^\infty e^{-\nu t}\,T_{\lambda,t}\,g\,dt \xrightarrow{s} \int_0^\infty e^{-\nu t}\,T_t g\,dt = R_\nu g.$$

f. Semi-groups convergence lemma. On a Banach space $H$, if endomorphisms $D_\lambda$ commute, $\|e^{tD_\lambda}\| \le 1$, and $D_\lambda \xrightarrow{s} D$ on a set $H'$ dense in $H$, then the strongly (in fact, uniformly) continuous exponential contraction semi-groups $(e^{tD_\lambda}, t \ge 0)$ converge to a strongly continuous contraction semi-group $(T_t, t \ge 0)$ whose infinitesimal operator is $D$ on $H_d \supset H'$.

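Proof. We use two elementary relations applicable to commuting endomorphisms:

$$\|\alpha^n - \beta^n\| \le n\,\|\alpha - \beta\|, \quad \|\alpha\|, \|\beta\| \le 1,$$

and $(e^{h\alpha} - I)/h \to \alpha$ as $h \to 0$.

Let $g \in H'$, $t \le a$, and exclude the trivial case $D_\lambda g = D_\mu g$. As $n \to \infty$,

$$\|(e^{tD_\lambda} - e^{tD_\mu})\,g\| \le a\,\left\| \frac{e^{(t/n)D_\lambda} - e^{(t/n)D_\mu}}{t/n}\,g \right\| \to a\,\|(D_\lambda - D_\mu)\,g\|.$$

As $\lambda, \mu \to \infty$, the last expression converges to 0, and the convergence assertion follows.

For any $\lambda$, let $R_\lambda^\nu$ be the resolvent of the semi-group of endomorphisms $e^{tD_\nu}$ and let $R_\lambda$ be that of the limit semi-group, so that $R_\lambda^\nu \xrightarrow{s} R_\lambda$ as $\nu \to \infty$. Since, for $g \in H'$,

$$R_\lambda^\nu(\lambda I - D)\,g = R_\lambda^\nu(\lambda I - D_\nu)\,g + R_\lambda^\nu(D_\nu - D)\,g,$$

where on the right side the first term is $g$ and the norm of the second term is bounded by $\|(D_\nu - D)\,g\|/\lambda \to 0$ as $\nu \to \infty$, it follows that $R_\lambda(\lambda I - D)\,g = g$, and the infinitesimal operator assertion follows by the resolvent lemma. The proof is terminated.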

We are now ready for the basic Hille-Yosida

C. Infinitesimal operators criterion. A linear transformation $D$ on a linear subset $H_d$ of a Banach space $H$ is the infinitesimal operator of one and only one strongly continuous contraction semi-group $(T_t, t \ge 0)$ on $H$, if and only if

(i) The inverse $R_\lambda$ of the transformation $\lambda I - D$ exists and is an endomorphism on $H$ with $\|\lambda R_\lambda\| \le 1$ for every $\lambda > 0$.

(ii) $\lambda R_\lambda \xrightarrow{s} I$ on $H$ as $\lambda \to \infty$, or (ii') $H_d$ is dense in $H$.

Then the exponential semi-groups $(e^{tD_\lambda}, t \ge 0)$, where $D_\lambda = \lambda(\lambda R_\lambda - I)$, converge strongly to the semi-group $(T_t, t \ge 0)$ as $\lambda \to \infty$.

Proof. Let $\lambda \to \infty$. Properties (ii) and (ii') are equivalent under (i): Let $\lambda R_\lambda g \to g$ for every $g \in H$. Then, from $\lambda R_\lambda g \in H_d$ it follows that $H_d$ is dense in $H$. Conversely, let $H_d$ be dense in $H$. Since $\|\lambda R_\lambda\| \le 1$ implies that for every $g \in H_d$

$$\|(\lambda R_\lambda - I)\,g\| = \|R_\lambda Dg\| \le \|Dg\|/\lambda \to 0,$$

hence the contractions $\lambda R_\lambda \xrightarrow{s} I$ on $H_d$, it follows that $\lambda R_\lambda \xrightarrow{s} I$ on $H$. The "only if" and the "unicity" assertions result from a and A. The "if" assertion then results from the convergence one, which we prove by showing that, because of (i), (ii), and (ii'), the semi-groups convergence lemma applies: Since, by (i), the $R_\lambda$ hence the $D_\lambda$ commute and $\|\lambda R_\lambda\| \le 1$, hence $\|e^{tD_\lambda}\| \le e^{-\lambda t}\,e^{\lambda t\|\lambda R_\lambda\|} \le 1$, on account of (ii') it suffices to show that $D_\lambda \xrightarrow{s} D$ on $H_d$. But, by (i), $R_\lambda(\lambda I - D) = I$ on $H_d$, hence, by (ii), $D_\lambda = \lambda R_\lambda D \xrightarrow{s} D$ on $H_d$. The proof is terminated.
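The approximating operators $D_\lambda = \lambda(\lambda R_\lambda - I)$ of the criterion can be watched converging in a case where everything is explicit. The sketch below again assumes the two-state generator $D = \begin{pmatrix} -a & a \\ b & -b \end{pmatrix}$ (our illustrative choice); $R_\lambda$ is obtained by inverting the 2-by-2 matrix $\lambda I - D$, $e^{tD_\lambda}$ is summed as a truncated series, and the result is compared with the closed-form transition matrix. All function names are hypothetical.

```python
import math


def mat_mul(A, B):
    """Product of two square matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(E, t, terms=80):
    """Partial sum of the exponential series sum_n (t^n / n!) E^n."""
    n = len(E)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        step = [[t * E[i][j] / k for j in range(n)] for i in range(n)]
        term = mat_mul(term, step)
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


def inverse_2x2(M):
    """Inverse of a 2-by-2 matrix by the adjugate formula."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]


def exact_P(t, a, b):
    """Closed-form semi-group e^{tD} of the two-state chain."""
    s = a + b
    e = math.exp(-s * t)
    return [[(b + a * e) / s, (a - a * e) / s],
            [(b - b * e) / s, (a + b * e) / s]]


def yosida_generator(lam, a, b):
    """D_lam = lam * (lam * R_lam - I), with R_lam = (lam*I - D)^{-1}."""
    R = inverse_2x2([[lam + a, -a], [-b, lam + b]])
    return [[lam * (lam * R[i][j] - (1.0 if i == j else 0.0)) for j in range(2)]
            for i in range(2)]


def yosida_error(lam, t, a, b):
    """Largest entrywise gap between exp(t * D_lam) and the exact semi-group."""
    approx = mat_exp(yosida_generator(lam, a, b), t)
    exact = exact_P(t, a, b)
    return max(abs(approx[i][j] - exact[i][j]) for i in range(2) for j in range(2))


for lam in (10.0, 100.0, 1000.0):
    print(lam, yosida_error(lam, 1.0, 2.0, 1.0))
```

The printed errors shrink as $\lambda$ grows, which is the finite-dimensional shadow of the strong convergence $e^{tD_\lambda} \to T_t$ asserted in the criterion.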

Corollary. Let $H$ be an invariant subspace of $G$. Under the conditions of the Hille-Yosida theorem, $T_t \ge 0$ for all $t > 0 \Rightarrow \lambda R_\lambda \ge 0$ for all $\lambda > 0 \Rightarrow e^{tD_\lambda} \ge 0$ for all $t, \lambda > 0 \Rightarrow T_t \ge 0$ for all $t > 0$.

For, the first implication results from the definition of $R_\lambda$, the second one follows from $e^{tD_\lambda} = e^{-\lambda t}\sum_{n=0}^{\infty} \frac{(\lambda t)^n}{n!}\,(\lambda R_\lambda)^n$, and the third one is obtained by letting $\lambda \to \infty$.

The definitions and properties in this subsection apply to semi-groups restricted to any given invariant subspace $H \subset G$, provided they are relativized accordingly, that is, $G$, $G_c$, $G'_c$, $G_d$, $G'_d$, ... are replaced by their intersections with $H$: $H$, $H \cap G_c$, and so on. However, there is a difficulty in the $\Phi$-weak case: $\Phi$-weak integrals, supposed to exist, of functions $g_t \in H$ may not belong to $H$, and some restriction is needed to eliminate this difficulty. For example, it suffices that $H$ be the space $C$. This brings out once more the advantage of invariant $C$. In fact, our study was quickly restricted to invariant subspaces selected according to the semi-group continuity requirements. Thus we are led to classify our semi-groups according to their invariant subspaces selected according to suitable requirements. What precedes, and the C-unicity theorem, show the convenience of the class for which $C$ is invariant.

45.3. Markov processes and semi-groups. We apply what precedes to the Markov case, searching for a probabilistic interpretation of the concepts and properties.

Let $X_T$, $T = [0, \infty)$, be a stationary regular Markov process. There is a one-to-one correspondence between the stationary law of evolution $P_{\mathfrak{X}} = (P_x, x \in \mathfrak{X})$ of $X_T$, the tr.pr. $P_t(x, S) = P(X_{s+t} \in S \mid X_s = x)$, the tr.d.f. $F_t^x(y) = P(X_{s+t} < y \mid X_s = x)$, the tr.ch.f. $f_t^x(u) = E(e^{iuX_{s+t}} \mid X_s = x)$, and the Markov semi-group $(T_t, t \ge 0)$ with

$$T_t g(x) = \int P_t(x, dy)\,g(y) = E_x\,g(X_t), \quad g \in G,$$

or, to emphasize stationarity,

$$T_t g(X_s) = E\{g(X_{s+t}) \mid X_s\}.$$

The Markov endomorphisms are adjoint to endomorphisms on the space $\Phi$ defined by

$$\varphi T_t(S) = \int \varphi(dx)\,P_t(x, S) = \int \varphi(dx)\,P_x[X_t \in S],$$

hence translating distributions $\varphi = P_s$ of $X_s$ according to

$$P_s T_t = P_{s+t}.$$

Let the tr.pr., equivalently the semi-group, be Borelian. Form the function

$$\lambda R_\lambda(x, S) = \lambda\int_0^\infty e^{-\lambda t}\,P_t(x, S)\,dt, \quad \lambda > 0.$$

It is a Borel function in $x$ and a pr. in $S$. Therefore, the endomorphism defined by

$$\lambda R_\lambda g(x) = \int \lambda R_\lambda(x, dy)\,g(y), \quad g \in G,$$

is Markovian. Since by interchanging the integrations

$$R_\lambda g(x) = \int_0^\infty e^{-\lambda t}\,T_t g(x)\,dt,$$

endomorphisms $R_\lambda$ appear as an extension on $G$ of the resolvents of $(T_t, t \ge 0)$. Furthermore, the iterates $(\lambda R_\lambda)^n$ are Markov endomorphisms defined by

$$(\lambda R_\lambda)^n g(x) = \int (\lambda R_\lambda)^{(n)}(x, dy)\,g(y), \quad g \in G,$$

with

$$(\lambda R_\lambda)^{(n)}(x, S) = \int (\lambda R_\lambda)^{(n-1)}(x, dy)\,\lambda R_\lambda(y, S).$$

It follows that every $e^{tD_\lambda}$ with $D_\lambda = \lambda(\lambda R_\lambda - I)$ is a Markov endomorphism defined by

$$e^{tD_\lambda} g(x) = \int (e^{tD_\lambda})(x, dy)\,g(y), \quad g \in G,$$

with

$$(e^{tD_\lambda})(x, S) = e^{-\lambda t}\left(I(x, S) + \frac{\lambda t}{1!}\,\lambda R_\lambda(x, S) + \frac{(\lambda t)^2}{2!}\,(\lambda R_\lambda)^{(2)}(x, S) + \cdots\right).$$

Thus, the exponential Markov semi-groups $(e^{tD_\lambda}, t \ge 0)$ appear as extensions of the exponential semi-groups of the infinitesimal operators criterion. Clearly, what precedes applies to the relativization on an invariant subspace $H$.

Finally, $D_h = (T_h - I)/h$, given by

$$D_h g(x) = \int \left(P_h(x, dy) - I(x, dy)\right) g(y)/h = E_x\left(g(X_h) - g(x)\right)/h, \quad h > 0,$$

or, to emphasize stationarity,

$$D_h g(X_s) = E\{g(X_{s+h}) - g(X_s) \mid X_s\}/h,$$

may be thought of as a "mean speed" operator. If

$$E\{g(X_{s+h}) - g(X_s) \mid X_s\}/h \xrightarrow{w} D'g(X_s), \quad g \in G'_d,$$

the "speed" operator $D'$ on $G'_d$ appears as an extension of the infinitesimal operators of $(T_t, t \ge 0)$.

We pursue our specialization to Markov semi-groups. We recall that $C$ denotes the subspace of $G$ formed by bounded continuous functions on the state space $\mathfrak{X}$. We denote by $C_u$ the subspace of uniformly continuous functions on $\mathfrak{X}$ and by $C_0$ that of continuous functions on $\mathfrak{X}$ vanishing at infinity, and have

$$C \supset C_u \supset C_0.$$

For any locally compact space $\mathfrak{X}$, $g \in C_0$ means that for every $\epsilon > 0$ there is a compact $K_\epsilon \subset \mathfrak{X}$ such that $|g(x)| < \epsilon$ for $x \notin K_\epsilon$; we write $g(x) \to 0$ as $x \to \infty$. (Note that if $\mathfrak{X}$ is compact there is no "point at infinity.") If $\mathfrak{X}$ is compact, the above condition is void and, also, continuity becomes uniform. Thus

If $\mathfrak{X}$ is compact, then $C = C_u = C_0$.

The restriction of a Markov semi-group on $H \subset G$ is defined by

$$T_t g(x) = \int P_t(x, dy)\,g(y), \quad g \in H,$$

and will be called Markovian on $H$.

a. The restriction of a Markov semi-group on $H \supset C_0$ determines the semi-group.

Proof. Upon letting $g$ vary over $G$ in the above integral representation of $T_t$ on $H$, the semi-group so extended is Markovian. This Markovian extension is unique because $T_t$ on $C_0$ determines the tr.pr. $P_t(x, S)$ as follows. Let $[a, b) \subset \mathfrak{X}$ be a bounded interval and take $g_n(x) = 1$ for $a \le x \le b - 1/n$, $= 0$ for $x \le a - 1/n$ and $x \ge b$, and linear for $a - 1/n \le x \le a$ and for $b - 1/n \le x \le b$. Thus, $g_n \in C_0$, $g_n \to I_{[a,b)}$ and, by the dominated convergence theorem,

$$T_t g_n(x) = \int P_t(x, dy)\,g_n(y) \to P_t(x, [a, b)).$$

Therefore, $T_t$ on $C_0$ determines $P_t(x, S)$ on the class of all bounded intervals $S = [a, b)$, hence on the Borel field $\mathcal{S}$. The lemma is proved.
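The approximation used in this proof is concrete enough to compute. The sketch below builds the trapezoidal functions $g_n$ for an interval $[a, b)$ and integrates them against a purely atomic probability measure, chosen by us for illustration so that the integral is a finite sum; the atoms placed just outside $a$ and exactly at $b$ are there to show that the limit picks out the half-open interval. The names `g_n` and `integrate` are hypothetical.

```python
def g_n(x, n, a, b):
    """Continuous trapezoid approximating the indicator of [a, b).

    Equal to 1 on [a, b - 1/n], 0 outside (a - 1/n, b), linear in between.
    """
    if x <= a - 1.0 / n or x >= b:
        return 0.0
    if x < a:
        return (x - (a - 1.0 / n)) * n   # rising edge on [a - 1/n, a]
    if x <= b - 1.0 / n:
        return 1.0
    return (b - x) * n                   # falling edge on [b - 1/n, b]


def integrate(g, atoms):
    """Integral of g against a discrete measure given as (point, mass) pairs."""
    return sum(mass * g(point) for point, mass in atoms)


a, b = 0.0, 1.0
# A probability measure with atoms near and at the endpoints of [a, b).
atoms = [(-0.001, 0.20), (0.0, 0.30), (0.5, 0.10), (0.999, 0.25), (1.0, 0.15)]
mass_of_interval = 0.30 + 0.10 + 0.25    # the measure of [a, b)

for n in (2, 100, 10_000):
    value = integrate(lambda x: g_n(x, n, a, b), atoms)
    print(n, value, abs(value - mass_of_interval))
```

As $n$ grows the integrals settle on the mass of $[a, b)$: the atom at $a$ is counted, the atom at $b$ is not, and the atom just to the left of $a$ eventually drops out.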

A. Markov infinitesimal operators criterion. A linear transformation $D$ on a subset $H_d$ of a Banach subspace $H \supset C_0$ to $H$ is the infinitesimal operator of one and only one Markov semi-group $(T_t, t \ge 0)$ on $G$ strongly continuous on invariant $H$, if and only if on $H$

(i) The inverse $R_\lambda$ of the transformation $\lambda I - D$ exists and $\lambda R_\lambda$ is a Markov endomorphism for every $\lambda > 0$.

(ii) $\lambda R_\lambda \xrightarrow{s} I$ on $H$ as $\lambda \to \infty$, or (ii') $H_d$ is dense in $H$.

Then, on $G$, the Markov extension $\lambda R_\lambda$ is determined and the exponential Markov semi-groups $(e^{tD_\lambda}, t \ge 0)$, where $D_\lambda = \lambda(\lambda R_\lambda - I)$, converge strongly to the semi-group $(T_t, t \ge 0)$ as $\lambda \to \infty$.

Proof. According to the infinitesimal operators criterion, the proposition holds on $H$ when we drop "Markov" and "$C_0$" therein. Since the semi-group $(T_t, t \ge 0)$ on $H$ is strongly continuous, hence Borelian, by the foregoing discussion and lemma, from $H \supset C_0$ it follows that we have:

Markov $(T_t, t \ge 0)$ on $G$ $\Rightarrow$ Markov $\lambda R_\lambda$ on $H$ $\Rightarrow$ Markov $(e^{tD_\lambda}, t \ge 0)$ on $H$ $\Rightarrow$ Markov $(e^{tD_\lambda}, t \ge 0)$ on $G$ $\Rightarrow$ Markov $(T_t, t \ge 0)$ on $G$.

The proof is terminated.

The spaces $C$, $C_0$, $C_u$ of continuous functions appear in the convergence
of laws criteria:

Let $F$, $f$, $\hat{f}$, with same affixes if any, denote corresponding d.f.'s, ch.f.'s,
and integral ch.f.'s. According to the complete (weak) convergence
criterion and the Helly-Bray theorem (extended lemma), we have the
equivalences (convergence of d.f.'s is, as usual, up to additive constants):

$$F_n \xrightarrow{c} F \iff f_n \to f \iff \int g\, dF_n \to \int g\, dF, \quad g \in C,$$
$$F_n \xrightarrow{w} F \iff \hat{f}_n \to \hat{f} \iff \int g\, dF_n \to \int g\, dF, \quad g \in C_0.$$

Note that the integral ch.f.'s correspond to the subset of functions $g \in C_0$
defined by $g_u(x) = (e^{iux} - 1)/ix$, $u \in R$, and the ch.f.'s correspond to the
subset of functions $g \in C$ defined by $g_u(x) = e^{iux}$, $u \in R$. Since the last
subset is also in $C_u$, we may replace $C$ by $C_u$ in what precedes.

The last equivalent forms of convergence of laws lead to the general
concept of $H$-convergence of laws: $F_n \xrightarrow{H} F$ meaning that $\int g\, dF_n \to \int g\, dF$
for every $g \in H$, and $F_n^x \xrightarrow{H} F^x$ uniformly in $x$ meaning that
$\int g\, dF_n^x \to \int g\, dF^x$ uniformly in $x$ for every $g \in H$. But for a
stationary regular Markov process $X_T$ with c.pr. $(P_x, x \in \mathfrak{X})$, hence
c.d.f.'s $F_t^x(y) = P_x[X_t < y]$, we have

$$(G) \qquad T_t g(x) = E_x g(X_t) = \int g\, dF_t^x, \quad g \in G.$$

Therefore, to $H$-convergence of laws correspond continuity properties
of its semi-group on $H$, as follows. Let the limits in $h$ be taken as
$0 < h \to 0$ and note that $F_0^x(y) = 0$ or $1$ according as $y \le x$ or $y > x$.

$F_h^x \xrightarrow{H} F_0^x$ corresponds to $T_h g \xrightarrow{w} g$ for every $g \in H$, that is, $H \subset G'_c$,
the weak continuity space of our semi-group. Similarly, $F_h^x \xrightarrow{H} F_0^x$
uniformly in $x$ corresponds to $T_h g \xrightarrow{s} g$ for every $g \in H$, that is, $H \subset G_c$,
the strong continuity space of our semi-group. This emphasizes the
distinguished role of spaces $C_0$, $C_u$, $C$, when the usual weak or complete
convergence of laws are considered.

The spaces of continuous functions also appear under “natural” re¬ 
quirements for semi-groups: stability and continuity of evolution. Ac¬ 
cording to the celebrated Hadamard principle, a “well set” evolution 
problem is stable, in the sense that small variations of initial data lead 
to small variations in the evolution. The weakest probabilistic
interpretation would be in terms of individual laws given the initial state.
To be precise, let $X_T$, $T = [0, \infty)$, be a regular process. We say that the
process is stable if the laws $\mathcal{L}(X_t \mid X_0 = x)$ are continuous in $x$ for every
$t$, equivalently, if for every $x$ and every $t$, as $x' \to x$,

$$F_t^{x'} \xrightarrow{c} F_t^x \quad\text{or}\quad f_t^{x'} \to f_t^x \quad\text{or}\quad \int g\, dF_t^{x'} \to \int g\, dF_t^x, \quad g \in C.$$


For a stationary regular Markov process $X_T$ and its semi-group
$(T_t, t \ge 0)$, it suffices to use (G) to obtain

b. A Markov semi-group leaves $C$ invariant if, and only if, the
corresponding Markov process is stable.

For, if $x' \to x$ and $g \in C$, then stability implies that

$$T_t g(x') = \int g\, dF_t^{x'} \to \int g\, dF_t^x = T_t g(x),$$

and invariance of $C$ implies that

$$\int g\, dF_t^{x'} = T_t g(x') \to T_t g(x) = \int g\, dF_t^x.$$

What precedes remains valid with "w" and "$C_0$" in lieu of "c" and "$C$"
except that we obtain $T_t g \in C$ in lieu of $T_t g \in C_0$. Thus, in order that
$C_0$ be invariant, we must also have

$$T_t g(x') = \int g\, dF_t^{x'} \to 0$$

as $x' \to \pm\infty$ for $g \in C_0$, equivalently $F_t^{x'}[a, b) \to 0$ as $x' \to \pm\infty$ for
every bounded $[a, b)$. We may interpret the conditions so obtained as
weak stability including infinity of the process, in the following sense:
$F_t^{x'} \xrightarrow{w} F_t^x$ for every $t$ and $x$ including $x = \pm\infty$ upon setting
$\mathrm{Var}\, F_t^{\pm\infty} = P_{\pm\infty}[\,|X_t| < \infty\,] = 0$. Thus

b'. A Markov semi-group leaves $C_0$ invariant if, and only if, the
corresponding process is weakly stable including infinity.

The other "natural" requirement is that of continuous evolution. The
weakest probabilistic interpretation would be that of continuity of laws
as $h \to 0$: $F_h^x \xrightarrow{c} F_0^x$ ($F_0^x(y) = 0$ or $1$ according as $y < x$ or $y > x$),
that is, $P_h(x, (x - \epsilon, x + \epsilon)^c) \to 0$ or $P_h(x, (x - \epsilon, x + \epsilon)) \to 1$ for
every $\epsilon > 0$; equivalently $T_h g(x) = \int g\, dF_h^x \to g(x)$, $g \in C$, that is,
$C \subset G'_c$. Thus

c. A Markov semi-group is $\Phi$-weakly continuous on $C$ at $t = 0$ if, and only
if, the corresponding Markov process is continuous in law at $t = 0$:
$P_h(x, (x - \epsilon, x + \epsilon)) \to 1$ for every $x$ and every $\epsilon > 0$.

If we consider $C_0$ and $C_u$, then uniform conditions appear. We shall
denote by $K$ closed bounded intervals in the state space and set
$V_x(\epsilon) = (x - \epsilon, x + \epsilon)$, whenever convenient; in fact, what follows is
valid with $K$ interpreted as a compact set.

c'. A Markov semi-group is strongly continuous on $C_0$ at $t = 0$ only if
$P_h(x, (x - \epsilon, x + \epsilon)) \to 1$ uniformly in $x \in K$ for every $K$ and every
$\epsilon > 0$.

Proof. Since every $K$ is compact, it can be covered by a finite number
of $V_{x_k}(\epsilon/4)$. Thus every $x \in K$ belongs to some $V_{x_k}(\epsilon/4)$, so that
$V_x(\epsilon) \supset V_{x_k}(\epsilon/2)$ and the "only if" assertion reduces to
$P_h(x, V_{x_k}(\epsilon/2)) \to 1$ uniformly in $x \in V_{x_k}(\epsilon/4)$. By hypothesis,
$T_h g(x) = \int P_h(x, dy)\, g(y) \to g(x)$ uniformly in $x$ for every $g \in C_0$.
Since we can select $g \in C_0$ such that $0 \le g \le 1$, $g = 1$ on the closure
of $V_{x_k}(\epsilon/4)$ and $g = 0$ on $V^c_{x_k}(\epsilon/2)$, it follows then that, uniformly in
$x \in V_{x_k}(\epsilon/4)$,

$$P_h(x, V_{x_k}(\epsilon/2)) \ge \int P_h(x, dy)\, g(y) \to g(x) = 1.$$

The assertion is proved.

c''. A Markov semi-group is strongly continuous on $C_u$ at $t = 0$ if
$P_h(x, (x - \epsilon, x + \epsilon)) \to 1$ uniformly in $x$ for every $\epsilon > 0$.

Proof. Let $g \in C_u$. If $P_h(x, (x - \epsilon, x + \epsilon)) \to 1$ uniformly in $x$
then, from $\| g \| \le c < \infty$ and

$$T_h g(x) - g(x) = \int_{|y-x|<\epsilon} (g(y) - g(x))\, dF_h^x(y) + \int_{|y-x|\ge\epsilon} (g(y) - g(x))\, dF_h^x(y) + (\mathrm{Var}\, F_h^x - 1)\, g(x),$$

it follows that as $h \to 0$ then $\epsilon \to 0$

$$\sup_x | T_h g(x) - g(x) | \le \sup_{|y-x|<\epsilon} | g(y) - g(x) | + 3c\, \sup_x P_h(x, (x - \epsilon, x + \epsilon)^c) \to 0,$$

hence $T_h g \xrightarrow{s} g$.
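
To see the size of the two terms in this bound, one may take as an illustration the Gaussian transition probabilities of Brownian motion; this choice of kernel is an assumption made for the example and is not singled out in the text above. The symbol $w_g$ for the modulus of continuity of $g$ is likewise introduced only here. A sketch of the estimate, using Chebyshev's inequality:

```latex
% Assumption: P_h(x, dy) is the normal law N(x, h) (Brownian motion).
% w_g denotes the modulus of continuity of g in C_u, with ||g|| <= c.
\[
  P_h\bigl(x, (x-\epsilon, x+\epsilon)^c\bigr)
    = P\bigl[\,|N(0,h)| \ge \epsilon\,\bigr]
    \le \frac{h}{\epsilon^{2}}
  \qquad\text{for every } x,
\]
\[
  \sup_x |T_h g(x) - g(x)|
    \le w_g(\epsilon) + 3c\,\frac{h}{\epsilon^{2}},
  \qquad
  w_g(\epsilon) = \sup_{|y-x|<\epsilon} |g(y) - g(x)|.
\]
```

With, say, $\epsilon = h^{1/3}$ both terms tend to $0$ as $h \to 0$, which is the strong continuity asserted in c''.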

We combine now the preceding lemmas b and c into

B. C-invariance and continuity criterion. A Markov semi-group
leaves the space $C$ invariant and is $\Phi$-weakly rightcontinuous on it if, and
only if, the corresponding process is stable and continuous in law at $t = 0$:
$F_t^{x'} \xrightarrow{c} F_t^x$ as $x' \to x$ and $P_h(x, (x - \epsilon, x + \epsilon)) \to 1$ as $h \to 0$, for
every $x$ and every $\epsilon > 0$.

If the state space $\mathfrak{X}$ is compact, then $C = C_u = C_0$ and b', c', and
c'' apply. In fact,

d. Invariance and continuity lemma. Let $C$ be invariant and $\mathfrak{X}$
be compact or let $C_0$ be invariant and $\mathfrak{X}$ be only locally compact. Then
on the invariant subspace, $\Phi$-weak rightcontinuity implies strong continuity.


Proof. 1° Let $C$ be invariant and $\mathfrak{X}$ be compact. For every $g \in C$,
form

$$g_\lambda(x) = \lambda R_\lambda g(x) = \int_0^\infty \lambda e^{-\lambda s}\, T_s g(x)\, ds, \quad \lambda > 0,$$

and note that $g_\lambda \in C$; for, by $C$ invariance and the dominated convergence
theorem, $g_\lambda(x_n) \to g_\lambda(x)$ as $x_n \to x$. In fact, $g_\lambda \in C_c$: By a
straightforward computation,

$$e^{-\lambda h}\, T_h g_\lambda(x) = \int_h^\infty \lambda e^{-\lambda s}\, T_s g(x)\, ds$$

so that, for $g \ge 0$, $e^{-\lambda h} T_h g_\lambda(x) \uparrow g_\lambda(x)$ as $h \downarrow 0$. However, on a compact
space monotone pointwise convergence implies uniform convergence
(Dini's lemma). Therefore, $T_h g_\lambda \xrightarrow{s} g_\lambda$ for $g \ge 0$, hence for any
$g = g^+ - g^-$.

Since $\mathfrak{X}$ is compact, $\Phi$ is adjoint to $C$. Therefore, by Riesz's
representation theorem, any bounded linear functional is of the form

$$\varphi(g) = \int \varphi(dx)\, g(x), \quad \varphi \in \Phi, \quad g \in C.$$

In particular,

$$\varphi(g_\lambda) = \int_0^\infty \lambda e^{-\lambda s} f(s)\, ds = \int_0^\infty e^{-u} f(u/\lambda)\, du$$

where

$$f(s) = \int \varphi(dx)\, T_s g(x).$$

When the semi-group is $\Phi$-weakly rightcontinuous, $f(s)$ is right continuous
in $s$ so that, as $\lambda \to \infty$,

$$f(u/\lambda) \to f(0) = \varphi(g), \quad\text{hence}\quad \varphi(g_\lambda) \to \varphi(g).$$

Thus, if there exists a $\varphi(\cdot)$ which vanishes on $C_c$, hence on all $g_\lambda$, then it
vanishes on all $g \in C$. Therefore, by the Hahn-Banach theorem,
$C_c = C$.

2° Let $C_0$ be invariant and $\mathfrak{X}$ only locally compact. The preceding
argument applies upon noting that $g_\lambda(x) \to 0$ as $|x| \to \infty$. Or, to
compactify $\mathfrak{X}$ into $\bar{\mathfrak{X}} = \mathfrak{X} + \{\infty\}$, set

$$\bar{P}_t(\infty, \{\infty\}) = 1, \quad \bar{P}_t(\infty, \mathfrak{X}) = 0$$

and, for $x \in \mathfrak{X}$, $S \in \mathcal{S}$, set

$$\bar{P}_t(x, S) = P_t(x, S), \quad \bar{P}_t(x, \{\infty\}) = 1 - P_t(x, \mathfrak{X}).$$

This function is a tr.pr. and the corresponding Markov semi-group $\bar{T}_t$
leaves $\bar{C} = C(\bar{\mathfrak{X}})$ invariant: For, every function $f \in \bar{C}$ is of the form
$f = g + c$ with $g \in C_0$, $c$ constant, and

$$\bar{T}_t g = T_t g, \quad \bar{T}_t c = c.$$

Furthermore, $\Phi$-weak rightcontinuity of $T_t$ on $C_0$ implies that of $\bar{T}_t$ on
$\bar{C}$ and 1° applies. The proof is terminated.

We say that weak stability including infinity is weak stability uniform
at infinity if at infinity $P_t(x', K) \to 0$ as $x' \to \pm\infty$ for every $K$ uniformly
in $t \le t_0$ arbitrary but finite.

B'. $C_0$-invariance and continuity criterion. A Markov semi-group
leaves the space $C_0$ invariant and is strongly continuous on it if, and only if,
the corresponding process is weakly stable uniformly at infinity and
$P_h(x, (x - \epsilon, x + \epsilon)) \to 1$ as $h \to 0$ uniformly in $x \in K$ for every $K$
and every $\epsilon > 0$.

Proof. By b' and c', $C_0$-invariance and weak stability including
infinity are equivalent and strong continuity implies the uniform in $x \in K$
condition. Therefore, denoting the strong continuity at $t = 0$ condition
by (i), the uniform at infinity condition by (ii), and the uniform in
$x \in K$ condition by (iii), it remains to prove that under $C_0$-invariance,
(i) implies (ii) and (ii) and (iii) imply (i). Thus, let $C_0$ be invariant.

1° Given $K$, form

$$g_\lambda(x) = \lambda R_\lambda g(x) = \int_0^\infty \lambda e^{-\lambda s}\, T_s g(x)\, ds, \quad \lambda > 0,$$

where $g \in C_0$ is positive and exceeds 1 on $K$. If (i) holds, then
$g_\lambda \in C_0$ ($g_\lambda$ is a strong integral),

$$T_t g_\lambda(x) = e^{\lambda t} \int_t^\infty \lambda e^{-\lambda r}\, T_r g(x)\, dr \le e^{\lambda t} g_\lambda(x),$$

and $g_\lambda(x)$ converges to $g(x)$ uniformly in $x$ as $\lambda \to \infty$ (apply A or,
directly, set $\lambda s = u$ and note that $T_{u/\lambda} g(x) \to g(x)$ uniformly in $x$).
Therefore, $g_{\lambda_0} \ge 1$ on $K$ for $\lambda_0$ sufficiently large, so that for $t \le t_0 < \infty$,
as $x' \to \pm\infty$

$$P_t(x', K) \le T_t g_{\lambda_0}(x') \le e^{\lambda_0 t_0} g_{\lambda_0}(x') \to 0,$$

and (ii) holds.

2° Let (ii) and (iii) hold. To establish strong continuity on $C_0$ at
$t = 0$, it suffices to prove it on the subspace $C_{00} \subset C_0$ of those continuous
functions which vanish outside $K$'s. For, given $\delta > 0$, every $g \in C_0$
can be decomposed into $g' + g''$ with $g' \in C_{00}$ and $\| g'' \| < \delta$, so that as
$h \to 0$ then $\delta \to 0$,

$$\| T_h g - g \| \le \| T_h g' - g' \| + 2\delta \to 0$$

provided $\| T_h g' - g' \| \to 0$. Similarly, because of (ii), $P_h(x, K) \le \delta$
for sufficiently small $h$ and for $x$ outside a sufficiently large interval, and
we can assume $C_{00}$ invariant under $T_h$. Thus, we can take every $g$ and
$T_h g$ arbitrarily small outside some corresponding $K$ sufficiently large.
Let $x, y \in K$, and let $c$ be a bound of $g$. Since $K$ is compact, given
$\delta > 0$ we can select $\epsilon > 0$ sufficiently small so that $| g(y) - g(x) | < \delta$
for $| y - x | < \epsilon$. It follows, by (iii), that uniformly in $x$, as $h \to 0$
then $\delta \to 0$,

$$| T_h g(x) - g(x) | \le \Bigl| \int_{|y-x|<\epsilon} P_h(x, dy)\,(g(y) - g(x)) \Bigr| + | g(x) |\, \{1 - P_h(x, (x - \epsilon, x + \epsilon))\} + \Bigl| \int_{|y-x|\ge\epsilon} P_h(x, dy)\, g(y) \Bigr|$$
$$\le \delta + 2c\, \{1 - P_h(x, (x - \epsilon, x + \epsilon))\} \to 0.$$

Thus (i) holds, and the proof is terminated.

We illustrate the foregoing concepts and properties in characterizing 
the infinitesimal operators of Markov semi-groups which correspond to 
stationary continuous decomposable processes. 

Let $(Z_t, t \ge 0)$ be a stationary continuous decomposable process. Let
$F_t$ and $f_t = e^{t\psi}$, $\psi = (\alpha, \Psi)$, be the i.d. d.f.'s and ch.f.'s of the
$Y_t = Z_t - Z_0$. Our process is a stationary regular Markov process with
$Z_t - Z_0$ independent of $Z_0$, $P_x = P$ and

$$F_t^x(z) = P(Z_t < z \mid Z_0 = x) = P(Y_t < z - x) = F_t(z - x).$$

It follows that $F_t^{x'} \xrightarrow{c} F_t^x$ as $x' \to x$ and $F_t^x \to 0$ or $1$ according as
$x \to +\infty$ or $-\infty$, that is, $F_t^x[a, b) \to 0$ for every bounded $[a, b)$; thus,
the process is stable and weakly stable at infinity. Also
$P_h(x, (x - \epsilon, x + \epsilon)^c) = P[\,| Y_h | \ge \epsilon\,] \to 0$ uniformly in $x$ for every
$\epsilon > 0$, for the process is continuous in law. Therefore, according to the
preceding propositions, the corresponding semi-group defined by

$$T_t g(x) = \int g(z)\, dF_t(z - x) = \int g(x + y)\, dF_t(y), \quad g \in G,$$

leaves the space $C_0$ (also $C$) invariant, and is strongly continuous on it.
Thus, the Markov infinitesimal operators criterion applies, and we can
and do limit ourselves to $C_0$. Let $g$ vary over $C_0$. The resolvent $R_\lambda$ is
defined by

$$\lambda R_\lambda g(x) = \int_0^\infty \lambda e^{-\lambda t} \Bigl\{ \int g(x + y)\, dF_t(y) \Bigr\}\, dt = \int g(x + y)\, dG_\lambda(y),$$

where the function $G_\lambda = \int_0^\infty \lambda e^{-\lambda t} F_t\, dt$ is a d.f. (weighted by the
exponential d.f. family of d.f.'s $F_t$) with ch.f.

$$g_\lambda(u) = \int_0^\infty \lambda e^{-\lambda t} e^{t\psi(u)}\, dt = \lambda / (\lambda - \psi(u)).$$
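
For a concrete case, suppose the process is standard Brownian motion, so that $\psi(u) = -u^2/2$; this choice is an assumption made for the example only. The ch.f. of $G_\lambda$ is then that of a symmetric exponential (Laplace) law, and the resolvent has an explicit kernel, sketched as:

```latex
% Assumption: standard Brownian motion, psi(u) = -u^2/2.
\[
  g_\lambda(u) = \frac{\lambda}{\lambda + u^{2}/2}
               = \frac{1}{1 + u^{2}/(2\lambda)},
  \qquad
  dG_\lambda(y) = \frac{\sqrt{2\lambda}}{2}\, e^{-\sqrt{2\lambda}\,|y|}\, dy,
\]
\[
  R_\lambda g(x) = \frac{1}{\lambda}\int g(x+y)\, dG_\lambda(y)
                 = \frac{1}{\sqrt{2\lambda}} \int g(x+y)\, e^{-\sqrt{2\lambda}\,|y|}\, dy .
\]
```

As $\lambda \to \infty$ the law $G_\lambda$ concentrates at $0$, which is the statement $\lambda R_\lambda g \to g$ in this case.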

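The operator $D_\lambda$ of the infinitesimal operators criterion is given by

$$D_\lambda g(x) = \lambda(\lambda R_\lambda - I)\, g(x) = \int \{ g(x + y) - g(x) \}\, \lambda\, dG_\lambda(y),$$

and the infinitesimal operator $D$ (in $C_0$) is the strong limit as $\lambda \to \infty$
of $D_\lambda$ on its domain of existence. Since, as $\lambda \to \infty$,

$$(\alpha, \Psi) = \psi \leftarrow \frac{\lambda\psi}{\lambda - \psi} = \int (e^{iuy} - 1)\, \lambda\, dG_\lambda(y) = \psi_\lambda = (\alpha_\lambda, \Psi_\lambda)$$

where

$$\alpha_\lambda = \int \frac{y}{1 + y^2}\, \lambda\, dG_\lambda(y), \quad d\Psi_\lambda(y) = \frac{y^2}{1 + y^2}\, \lambda\, dG_\lambda(y),$$

the convergence theorem 22.1D applies and

$$\alpha_\lambda \to \alpha, \quad \Psi_\lambda \xrightarrow{c} \Psi.$$

In order to use this convergence property, we make appear $\alpha_\lambda$ and $\Psi_\lambda$
in the integral representation of $D_\lambda g(x)$. Proceeding formally, it takes
the form

$$D_\lambda g(x) = g'(x)\, \alpha_\lambda + \int h_g(x, y)\, d\Psi_\lambda(y)$$

where

$$h_g(x, y) = \Bigl\{ g(x + y) - g(x) - \frac{y}{1 + y^2}\, g'(x) \Bigr\} \frac{1 + y^2}{y^2}$$

is defined by continuity at $y = 0$ to be $h_g(x, 0) = \frac{1}{2} g''(x)$. Always
formally, the convergence property then yields

$$D_\lambda g(x) \to Dg(x) = \alpha g'(x) + \int h_g(x, y)\, d\Psi(y).$$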

Thus, we are led to consider the operator $D$ on the set $C''_0 \subset C_0$ of
all twice differentiable $g \in C_0$ with $g', g'' \in C_0$. Note that $C''_0$ is dense
in $C_0$ so that the infinitesimal operator on $C''_0$ determines the Markov
semi-group. We use the preceding notation for the Ito-Neveu theorem:

C. The infinitesimal operators $D$ in $C_0$ of Markov semi-groups
corresponding to stationary continuous decomposable processes are of the form

$$Dg(x) = \alpha g'(x) + \int h_g(x, y)\, d\Psi(y), \quad g \in C''_0.$$

Proof. If $g \in C''_0$, then the function $h_g(x, y)$ is defined and is
bounded and continuous in $x$ and $y$ with $h_g(x, 0) = \frac{1}{2} g''(x)$ and, by
elementary computations,

(i) $h_g(x, y) \to h_g(x, 0)$ uniformly in $x$, as $y \to 0$

(ii) $h_g(x', y) \to h_g(x, y)$ uniformly in $y$, as $x' \to x$

(iii) $h_g(x, y) \to 0$ uniformly in $| y | \le a$ (arbitrary but finite), as
$x \to \pm\infty$.

Using these properties, it suffices to go over the foregoing formal
argument to verify that it is valid for $g \in C''_0$.
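
Two standard special cases may help in reading the formula of the theorem. The choices of $\alpha$ and $\Psi$ below are assumptions made for illustration, with $\Psi$ normalized so that $\psi(u) = iu\alpha + \int \{ e^{iuy} - 1 - iuy/(1+y^2) \} \frac{1+y^2}{y^2}\, d\Psi(y)$; the rate is written $c$ to avoid a clash with the resolvent parameter $\lambda$.

```latex
% (1) Brownian motion with drift alpha and variance sigma^2 per unit time:
%     Psi has the single mass sigma^2 at y = 0 (assumed normalization above).
\[
  Dg(x) = \alpha g'(x) + \sigma^{2}\, h_g(x,0)
        = \alpha g'(x) + \tfrac{\sigma^{2}}{2}\, g''(x).
\]

% (2) Poisson process with rate c and unit jumps, psi(u) = c(e^{iu} - 1):
%     alpha = c/2 and Psi has the single mass c/2 at y = 1.
\[
  h_g(x,1) = 2\bigl\{ g(x+1) - g(x) - \tfrac12 g'(x) \bigr\},
\]
\[
  Dg(x) = \tfrac{c}{2}\, g'(x) + \tfrac{c}{2}\, h_g(x,1)
        = c\,\bigl\{ g(x+1) - g(x) \bigr\}.
\]
```

In (2) the first-order terms cancel, so the operator is a pure difference operator, as one would expect for a process that moves only by jumps.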

§ 46. SAMPLE CONTINUITY AND DIFFUSION OPERATORS

46.1. Strong Markov property and sample rightcontinuity. The 
basic results to be established here permit us to recognize the strong 
Markov property and lead to Dynkin’s blending of the semi-group and 
sample analysis, to be performed in the next subsection.

The first basic theorem is a generalized formulation (by Yushkevitch) 
of the corollary to 38.4A. 

A. Strong Markov property theorem. Let $X_T = (X_t, t \ge 0)$ be
stationary Markovian with measurable state space $(\mathfrak{X}, \mathcal{S})$, tr.pr. $P_t(x, S)$,
and corresponding semi-group $(T_t, t \ge 0)$ on the space $G$ of bounded
$\mathcal{S}$-measurable functions $g$ on $\mathfrak{X}$.

If $X_T$ and its tr.pr. are Borelian, then $X_T$ is strongly Markovian
whenever there exists a topology in $\mathfrak{X}$ such that the subspace $C$ of functions
$g \in G$ continuous in this topology is $\Phi$-weakly dense in $G$ and

(i) the semi-group $(T_t, t \ge 0)$ leaves $C$ invariant,

(ii) the sample functions $X_T(\omega)$, $\omega \in \Omega$, are rightcontinuous in this
topology.

Proof. Let $\tau$ be an arbitrary time of $X_T$ and let $s, t \ge 0$ be arbitrary
degenerate times. Since $X_T$ and its tr.pr. are Borelian, because of the
strong Markov equivalence theorem 38.4A(ii), it suffices to prove the
validity for $\tau < \infty$ of the strong semi-group property

$$T_{\tau+t}\, g = T_\tau T_t\, g, \quad g \in G.$$

Thus, in what follows, we restrict ourselves to $\Omega_\tau = [\tau < \infty]$ without
further comment.

Let $g \in C$. The elementary times

$$\tau_n = \sum_{k=1}^{\infty} \frac{k}{2^n}\, I_{[(k-1)/2^n \le \tau < k/2^n]}$$

converge to $\tau$ from the right. Since $X_T$ is sample rightcontinuous (in the
selected topology), $X_{\tau_n+s} \to X_{\tau+s}$, so that $g(X_{\tau_n+s}) \to g(X_{\tau+s})$ and,
consequently,

$$T_{\tau_n+s}\, g(x) = E_x\, g(X_{\tau_n+s}) \to E_x\, g(X_{\tau+s}) = T_{\tau+s}\, g(x).$$

The elementary times $\tau_n$ are times of $X_T$, hence, by 38.4d, are its Markov
times and $T_{\tau_n+t}\, g = T_{\tau_n} T_t\, g$. But $T_t g \in C$, hence

$$T_{\tau+t}\, g \leftarrow T_{\tau_n+t}\, g = T_{\tau_n}(T_t g) \to T_\tau T_t\, g.$$

Since $C$ is $\Phi$-weakly dense in $G$ and Markov endomorphisms commute
with $\Phi$-weak passages to the limit, it follows that $T_{\tau+t}\, g = T_\tau T_t\, g$ for all
$g \in G$. The theorem is proved.

Particular cases. The advantage of the foregoing general formulation
lies in the freedom of choice of the topology, whether or not there is
already one in $\mathfrak{X}$; the state sets $S \in \mathcal{S}$ are not necessarily topological
Borel sets in the topology to be selected. The price of freedom is the
requirement that $C$ be dense in $G$ and $X_T$ and its tr.pr. be Borelian. This
price is reduced when the freedom of choice is restricted, as follows.

1° If the tr.pr. is Borelian and the state sets are topological Borel sets
in a given metric topology in $\mathfrak{X}$, then $X_T$ is strongly Markovian whenever (i)
and (ii) hold in a topology at least as fine as the given metric one.

Note that the recalled corollary enters into this particular case upon 
selecting the given topology. 

To pass from the general formulation A to the particular case, let $C$
and $C'$ be the subspaces of those functions belonging to $G$ which are
continuous in the new topology $\mathcal{O}$ and in the given metric topology $\mathcal{O}'$,
respectively. By hypothesis, $\mathcal{O} \supset \mathcal{O}'$ hence $C \supset C'$. But the state sets
$S \in \mathcal{S}$ are now topological Borel sets in the metric topology $\mathcal{O}'$ and the
functions $g \in G$ are $\mathcal{S}$-measurable. Therefore $C'$ and a fortiori $C$ is
$\Phi$-weakly dense in $G$. Also, the sample functions being rightcontinuous
in $\mathcal{O}$ are rightcontinuous in $\mathcal{O}'$. Right continuity in $t$ and measurability
in $\omega$ of $X_t(\omega)$ imply that $X_T$ is Borelian.

The finest topology in a space is the trivial discrete topology $\mathcal{D}$, in
which all singletons hence all sets are open. With this topology, only
condition (ii) remains:

2° If the sample functions of $X_T$ are rightcontinuous in the discrete
topology in the state space, then $X_T$ is strongly Markovian.

For, in the discrete topology all functions on $\mathfrak{X}$ are continuous, so that
the density and invariance conditions are trivially true. Sample right
continuity in $\mathcal{D}$ implies that, for every $x \in \mathfrak{X}$, $P_h(x, \{x\}) \to 1$ as
$h \to 0$, hence $P_h(x, S) \to I(x, S)$ and, by the Kolmogorov equation,

$$P_{t+h}(x, S) = \int P_t(x, dy)\, P_h(y, S) \to P_t(x, S).$$

Rightcontinuity in $t$ and measurability in $\omega$ or $x$ imply that $X_T$ and its
tr.pr. are Borelian.

The second basic theorem, essentially due to Dynkin, is as follows.

B. Markov time theorem. Let stationary Markovian $X_T$ and its
semi-group $(T_t, t \ge 0)$ be Borelian, with extended resolvent $R_\lambda$ on $G$, and
infinitesimal operators $D$ on $G_D$ and $D'$ on $G'_D$. If $\tau$ is a Markov time of
$X_T$ then, whatever be $g \in G$,

$$E_x\{ e^{-\lambda\tau} R_\lambda g(X_\tau) \} = E_x \int_\tau^\infty e^{-\lambda u} g(X_u)\, du$$

and, for $g \ge 0$, $(-e^{-\lambda(\tau+t)} R_\lambda g(X_{\tau+t}),\ t \ge 0)$ is a submartingale on
$(\Omega, \mathcal{A}, P_x)$.

If, moreover, $E_x \tau < \infty$ then, for $g \in G_D$,

$$E_x\, g(X_\tau) = g(x) + E_x \int_0^\tau Dg(X_t)\, dt$$

and the same is true with $D'$ and $G'_D$ in lieu of $D$ and $G_D$.

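Proof. We have $T_u g(x) = E_x\, g(X_u)$ and, upon interchanging the
integrations,

$$R_\lambda g(x) = \int_0^\infty e^{-\lambda u}\, T_u g(x)\, du = E_x \int_0^\infty e^{-\lambda u} g(X_u)\, du.$$

Since a Markov time $\tau$ of a stationary Markovian $X_T$ is stationary and
$e^{-\lambda\tau} = 0$ on $[\tau = \infty]$, it follows that

$$E_x\{ e^{-\lambda\tau} R_\lambda g(X_\tau) \} = E_x \Bigl\{ e^{-\lambda\tau}\, E_x \Bigl( \int_0^\infty e^{-\lambda u} g(X_{\tau+u})\, du \Bigm| X_\tau \Bigr) \Bigr\} = E_x \int_\tau^\infty e^{-\lambda u} g(X_u)\, du,$$

and the first asserted equality is proved. Similarly, for $s \le t$,

$$E_x\{ e^{-\lambda(\tau+t)} R_\lambda g(X_{\tau+t}) \mid X_{\tau+r},\ r \le s \} = E_x \Bigl\{ \int_{\tau+t}^\infty e^{-\lambda u} g(X_u)\, du \Bigm| X_{\tau+r},\ r \le s \Bigr\}$$

so that, for $g \ge 0$, the left side is no larger than

$$E_x \Bigl\{ \int_{\tau+s}^\infty e^{-\lambda u} g(X_u)\, du \Bigm| X_{\tau+r},\ r \le s \Bigr\} = e^{-\lambda(\tau+s)} R_\lambda g(X_{\tau+s}),$$

and the submartingale assertion follows.

Finally, by the first asserted equality, for any $g' \in G$,

$$E_x\{ e^{-\lambda\tau} R_\lambda g'(X_\tau) \} = R_\lambda g'(x) - E_x \int_0^\tau e^{-\lambda u} g'(X_u)\, du.$$

Therefore, if $g' = (\lambda I - D) g$ with $g \in G_D$, hence $R_\lambda g' = g$, then

$$E_x\{ e^{-\lambda\tau} g(X_\tau) \} = g(x) + E_x \int_0^\tau e^{-\lambda u} Dg(X_u)\, du - E_x \int_0^\tau \lambda e^{-\lambda u} g(X_u)\, du.$$

Since $| g | \le c < \infty$, the last term is bounded by $c\, E_x(1 - e^{-\lambda\tau}) \le c \lambda E_x \tau$
so that, letting $\lambda \to 0$, if $E_x \tau < \infty$ then

$$E_x\, g(X_\tau) = g(x) + E_x \int_0^\tau Dg(X_u)\, du.$$

Similarly for $D'$ and $G'_D$ in lieu of $D$ and $G_D$, and the last assertion is
proved. The proof is terminated.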
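
A classical use of the last formula can be sketched under assumptions not made in the text. Let $X_T$ be standard Brownian motion with continuous sample functions, with infinitesimal operator taken to be $Dg = \frac{1}{2} g''$ on suitable $g$. Let $\tau$ be the first exit time from $(-a, a)$, assumed to satisfy $E_x \tau < \infty$. Take a bounded $g \in G_D$ which coincides with $y \mapsto y^2$ on $[-a, a]$. Then:

```latex
% Assumptions: standard Brownian motion, Dg = g''/2, tau = first exit time from (-a, a),
% E_x tau < infinity, and g(y) = y^2 for |y| <= a (g is modified outside [-a, a] so as to lie in G_D).
\[
  Dg(y) = 1 \quad (|y| \le a),
  \qquad
  |X_\tau| = a ,
\]
\[
  a^{2} = E_x\, g(X_\tau)
        = g(x) + E_x \int_0^{\tau} Dg(X_u)\, du
        = x^{2} + E_x \tau ,
\]
\[
  E_x \tau = a^{2} - x^{2}, \qquad |x| \le a .
\]
```

The values of $g$ outside $[-a, a]$ play no role, since $X_u$ stays in $[-a, a]$ for $u \le \tau$.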

The Markov time theorem applies to all degenerate times $t \ge 0$ for
our stationary Markovian $X_T$. It applies to all times of $X_T$, provided
$X_T$ satisfies the requirements of theorem A. These requirements are of
two different kinds: A(i) is relative to the corresponding Markov
semi-group and, thus, is in terms of the tr.pr. A(ii) requires rightcontinuity
of sample functions and, thus, is not in terms of the tr.pr. Yet, the
primary datum in the investigation and use of Markov property is the
tr.pr. This leads us to a search for tr.pr.'s, equivalently, Markov
semi-groups, to which correspond sample rightcontinuous $X_T$.

Given a tr.pr. on a measurable state space $(\mathfrak{X}, \mathcal{S})$, the preliminary
question is whether there exists a corresponding $X_T$. The answer is as
follows: According to the existence theorem 43.2A, $X_T$ exists when $\mathfrak{X}$ is
the Euclidean line and $\mathcal{S}$ is the $\sigma$-field of its topological Borel sets. The
proof based upon the Tulcea theorem 8.3A extends trivially to any
finite-dimensional Borel space or Borel subset thereof and, in fact, is
valid in the abstract case below.

Let $\mathfrak{X}$ be a separable locally compact metric space with metric "$d$",
and let $\mathcal{S}$ be its $\sigma$-field of topological Borel sets; we shall denote by
$V_x(\epsilon)$ the sphere $[y: d(x, y) < \epsilon]$. The proof of the separability existence
theorem 38.2B remains valid and, thus, there exists a separable for
closed sets $\tilde{X}_T$ equivalent to $X_T$, hence Markovian with same tr.pr.

We are now ready for our problem. What follows applies to any
state space of the above nature and is couched in corresponding terms.
In particular, if $x_n$ goes out of any compact $K \subset \mathfrak{X}$, we write $x_n \to \infty$
and denote by $\mathfrak{X} + \{\infty\}$ the one point compactification of $\mathfrak{X}$ (in the case
of a Borel line, "$\infty$" denotes "$\pm\infty$" lumped together). However, as
usual, to fix the ideas, we take $(\mathfrak{X}, \mathcal{S})$ to be the Borel line, without further
comment. The results are essentially those of Kinney (with modifications
due to Blumenthal and Maruyama). We emphasize the methods:
the direct method and the martingales method.

From now on, $X_T$ is Borel stationary Markovian and separated for
closed sets, with tr.pr. $P_t(x, S)$ and semi-group $(T_t, t \ge 0)$. As usual,
the limits in $h$ are taken as $h \to 0$.

a. Sample limits lemma. Let $T_h g(x) \to g(x)$ for every $x \in \mathfrak{X}$ and
every $g \in C_0$.

If the $T_h g \in C$ or the convergence is uniform in $x \in K$ for every
compact $K$, then almost all sample functions of $X_T$ have at any time at most one
left and one right limit value belonging to $\mathfrak{X}$.

If the convergence is uniform in $x \in \mathfrak{X}$, then almost all sample functions
of $X_T$ have at any time left and right limits (belonging to $\mathfrak{X}$ or $\mathfrak{X} + \{\infty\}$
according as $\mathfrak{X}$ is compact or not).

Proof. It suffices to give the proof for sample functions which start
at any given $x \in \mathfrak{X}$, that is, on the pr. space $(\Omega, \mathcal{A}, P_x)$. Because of
separability of $X_T$, all limits along $[0, \infty)$ may and will be taken along
some fixed countable separating set, without further comment.

1° Take $g \in C_0$ with $g \ge 0$. By theorem B, $(-e^{-\lambda t} R_\lambda g(X_t),\ t \ge 0)$ is a
submartingale. Since it is bounded, the martingales limits theorem (39.1C)
applies. Thus, neglecting a null event throughout the rest of the proof,
at any time all sample functions $R_\lambda g(X_T(\omega))$ have left and right limits
for any $g$ and any rational $\lambda > 0$. Since

$$\lambda R_\lambda g(x) = \int_0^\infty \lambda e^{-\lambda t}\, T_t g(x)\, dt = \int_0^\infty e^{-u}\, T_{u/\lambda}\, g(x)\, du,$$

it follows that, as $\lambda \to \infty$, $\lambda R_\lambda g(x) \to g(x)$ for every $x$ or uniformly in
$x \in K$, according as $T_h g(x) \to g(x)$ for every $x$ or uniformly in $x \in K$.
Let, say, $t' \uparrow t$; what follows applies as well to $t' \downarrow t$.

2° Suppose that $X_{t'}(\omega)$ has distinct limit values $x' \ne x''$ belonging
to $\mathfrak{X}$.

Let the $T_h g \in C$. There is a $g \in C_0$ such that $g(x') \ne g(x'')$, hence
$\lambda R_\lambda g(x') \ne \lambda R_\lambda g(x'')$ for a sufficiently large rational $\lambda$. This contradicts
the existence of a unique limit value of $\lambda R_\lambda g(X_{t'}(\omega))$.

Let the convergence be uniform in $x \in K$ for every compact $K$. There is
a $g \in C_0$ with $g = 1$ on $K'$ and $g = 0$ on $K''$, where $K'$ and $K''$ are
disjoint compact neighborhoods of $x'$ and $x''$, respectively. Thus, given
$\epsilon > 0$, for a sufficiently large rational $\lambda$ independent of $x \in K' + K''$,
$| \lambda R_\lambda g(x) - g(x) | < \epsilon$, while there are two sequences $t'_n, t''_n \uparrow t$ such
that $g(x'_n) \to 1$, $g(x''_n) \to 0$, where $x'_n = X_{t'_n}(\omega)$, $x''_n = X_{t''_n}(\omega)$.
Therefore, as $n \to \infty$, then $\epsilon \to 0$,

$$1 \leftarrow | g(x'_n) - g(x''_n) | \le | \lambda R_\lambda g(x'_n) - \lambda R_\lambda g(x''_n) | + 2\epsilon \to 0,$$

and we reach a contradiction.

Finally, let the convergence be uniform in $x \in \mathfrak{X}$. Upon compactifying
$\mathfrak{X}$ as in 45.3d, the preceding case applies (a fortiori, if $\mathfrak{X}$ is already
compact). The proof is terminated.

The first part of a yields (use 45.3c for the first hypothesis)

a'. For every $\epsilon > 0$, let $P_h(x, V_x(\epsilon)) \to 1$ for every $x \in \mathfrak{X}$ and $C$ be
invariant, or let $P_h(x, V_x(\epsilon)) \to 1$ uniformly in $x \in K$ for every
compact $K$.

Then, for almost all $\omega \in \Omega$ and all $t > 0$, as $t' \uparrow t$ and as $t' \downarrow t$, $X_{t'}(\omega)$
converges to some $x \in \mathfrak{X} + \{\infty\}$ or $X_{t'}(\omega)$ has two limit values: $x \in \mathfrak{X}$
and $\infty$.

When $\mathfrak{X}$ is compact, the point at infinity disappears. Otherwise, to
eliminate the point at infinity and, thus, have at any time left and
right limits belonging to $\mathfrak{X}$, it suffices that the sample functions be
bounded. In fact, uniform stability at infinity suffices:

b. If $P_t(x, K) \to 0$ as $x \to \infty$ uniformly in $t$ on every finite time
interval for every compact $K$, then almost all sample functions of $X_T$ are
bounded on every finite time interval.

If $\mathfrak{X}$ is compact, then the sample functions are necessarily bounded,
and we interpret the hypothesis as trivially true.

Proof. Let $t > 0$ and compact $K_n \uparrow \mathfrak{X}$. Suppose there are on $[0, t]$
unbounded sample functions corresponding to an event of pr. $\delta$. It
suffices to prove that $P[X_t \in K] \le 1 - \delta$ for any given compact $K$, for
then

$$1 - \delta \ge P[X_t \in K_n] \to P[X_t \in \mathfrak{X}] = 1$$

and $\delta = 0$. Let $q_n$ be the supremum of $P_s(x, K)$ over all $s \le t$ and over
all $x \in K_n^c$; by hypothesis, $q_n \to 0$.

Let $t_0 < \cdots < t_m$ be points of an arbitrary finite set $T'$ in $[0, t]$. Given
$K_n$, let $\tau(\omega) = t_j$ where $t_j$ is the first of these points for which
$[X_{t_j}(\omega) \in K_n^c]$ or $\tau(\omega) = \infty$ if there are no such points. Thus

$$[\tau < \infty] = [X_{t_j} \in K_n^c \text{ for some } t_j \in T'], \quad [\tau = \infty] = [X_{t_j} \in K_n \text{ for all } t_j \in T'].$$

The simple time $\tau$ is a time of $X_T$, hence is its Markov time. Therefore,

$$P[\tau < \infty,\ X_t \in K] = E\{ I_{[\tau<\infty]}\, P_{t-\tau}(X_\tau, K) \} \le q_n\, P[\tau < \infty]$$

and

$$P[X_t \in K] \le P[\tau = \infty] + q_n.$$

Apply the standard separability procedure: take a sequence of finite
subsets $T'$ of a separating set of $X_{[0,t]}$ converging increasingly to this
set. It follows that

$$P[X_t \in K] \le P[X_s \in K_n,\ s \le t] + q_n \le 1 - \delta + q_n \to 1 - \delta,$$

and the assertion is proved. The proof is terminated.

Remark. The above method of proof is direct. But we may also
use the martingales method, as follows: For positive $g \in C_0$,

$$T_t g(x) = \int_{K + K^c} P_t(x, dy)\, g(y) \le c\, P_t(x, K) + \sup_{y \in K^c} g(y)$$

so that, by hypothesis, $T_t g(x) \to 0$ as $x \to \infty$. It follows that the
unbounded sample functions on $[0, a]$ correspond to the event
$A = [\inf_{t \le a} R_\lambda g(X_t) = 0]$ and, by 39.1b(ii) applied to the submartingale
formed by $-e^{-\lambda t} R_\lambda g(X_t)$, $\int_A e^{-\lambda a} R_\lambda g(X_a)\, dP = 0$. The integrand is
positive so that $PA = 0$.

Under the condition of b and either one of the conditions of a', almost
all sample functions of $X_T$ have at any time left and right limits
belonging to $\mathfrak{X}$. Since $X_T$ is separable for closed sets, almost all sample
functions are continuous except for countable sets of strict jumps.
Furthermore, the weakest condition: $P_h(x, V_x(\epsilon)) \to 1$ for every $x \in \mathfrak{X}$ and
every $\epsilon > 0$ implies that $X_T$ (that is, every r.f. belonging to it) is right
continuous in pr., since for every $x$

$$P_x[d(X_t, X_{t+h}) \ge \epsilon] = \int P_t(x, dy)\, P_h(y, V_y^c(\epsilon)) \to 0,$$

hence for every initial distribution $P_0$

$$P[d(X_t, X_{t+h}) \ge \epsilon] = \int P_0(dx)\, P_x[d(X_t, X_{t+h}) \ge \epsilon] \to 0.$$

Thus, if $S$ separates $X_T$ and we set $\tilde{X}_t = X_t$ for $t \in S$ and
$\tilde{X}_t = \lim_{t' \downarrow t} X_{t'}$ ($t' \in S$) for $t \notin S$ ($\tilde{X}_t = \lim_{t' \uparrow t} X_{t'}$ for $t \notin S$),
then $\tilde{X}_T$ separated for closed sets by $S$ is equivalent to $X_T$. Hence,
$\tilde{X}_T$ has same tr.pr. and almost all its sample functions are right (left)
continuous except perhaps at those points of $S$ which are fixed
discontinuity points of $X_T$ (where $X_t(\omega)$ may coincide with $X_{t+0}(\omega)$ or with
$X_{t-0}(\omega)$ according to the choice of $\omega$); to eliminate this last obstacle
for sample right (left) continuity, we may replace the $X_t$ by the
$X_{t+0}$ ($X_{t-0}$) and, in fact, a.s. sample continuity may also be imposed upon
$X_T$ using 38.3A and the particular cases which follow it. However, it will
be more instructive to give a direct answer (which overlaps the preceding
lemmas) to the problem of sample right (left) continuity.

C. Markov sample rightcontinuity theorem. Let a separated for
closed sets $X_T$ be stationary Markovian with tr.pr. $P_t(x, S)$. Let either of
the two following conditions hold: as $h \to 0$

(i) $P_h(x, V_x(\epsilon)) \to 1$ uniformly in $x \in \mathfrak{X}$ for every $\epsilon > 0$

(ii) $P_h(x, V_x(\epsilon)) \to 1$ uniformly in $x \in K$ for every compact $K$ and
every $\epsilon > 0$, and $P_t(x, K) \to 0$ as $x \to \infty$ uniformly in $t$ on every finite
time interval for every compact $K$.

Then, $X_T$ is a.s. continuous, almost all its sample functions are continuous
except for countable sets of strict jumps, and there exists a separated for
closed sets equivalent $\tilde{X}_T$ with almost all sample functions right (left)
continuous and left (right) limits for every $t$.

Proof. Let $T'$ be a time subset with first element $t_0$. Let $\epsilon > 0$ and
$k = 1, \ldots, n$. Let $\omega \in A_n(T') \iff X_T(\omega)$ has $n$ oscillations greater than
$\epsilon$ on $T'$, that is, there exist $n$ pairs $(t'_k, t''_k)$ of points $t'_k < t''_k \le t'_{k+1}$
of $T'$ such that $d(X_{t'_k}(\omega), X_{t''_k}(\omega)) > \epsilon$. For every $\omega$ let $\tau_0(\omega) = t_0$
and let $\tau_k(\omega)$ be the first point $t$ of $T'$ following $\tau_{k-1}(\omega)$ for which
$d(X_t(\omega), X_{\tau_{k-1}(\omega)}(\omega)) > \epsilon/2$ or $\tau_k(\omega) = \infty$ if there is no such point.
We have

$$A_n(T') \subset B_n(T') = [\tau_0 < \cdots < \tau_n < \infty]$$

and if $T'$ is a finite set, then the $\tau_k$ are simple times of $X_T$, hence are its
Markov times.
1° Suppose (i) holds: Given $\epsilon, \delta > 0$, there exists an $h = h(\epsilon, \delta) > 0$
such that $P_{h'}(x, V_x^c(\epsilon/4)) < \delta$ for all $h' \le h$ and all $x \in \mathfrak{X}$. Let
$I = [a, b]$ be an interval of length $h$, set $P_\tau^x C = P(C \mid X_\tau = x)$, and let
$p_n$ be the supremum of $P_{\tau_0}^x[\tau_0 < \cdots < \tau_n < \infty]$ over all finite sets
$T' \subset I$ and over all $x \in \mathfrak{X}$. For any such $T'$

$$P_{\tau_0}^x[\tau_0 < \cdots < \tau_n < \infty] = E_{\tau_0}^x \{ I_{[\tau_0 < \tau_1 < \infty]}\, P(\tau_1 < \cdots < \tau_n < \infty \mid X_{\tau_1}) \}$$

so that $p_n \le p_{n-1}\, p_1$ and, by induction, $p_n \le p_1^n$. Furthermore,

$$P_{\tau_0}^x[\tau_0 < \tau_1 < \infty,\ X_b \in V_x(\epsilon/4)] \le E_{\tau_0}^x \{ I_{[\tau_0 < \tau_1 < \infty]}\, P(X_b \in V^c_{X_{\tau_1}}(\epsilon/4) \mid X_{\tau_1}) \} < \delta,$$

so that

$$P_{\tau_0}^x[\tau_0 < \tau_1 < \infty] < \delta + P_{\tau_0}^x[X_b \in V_x^c(\epsilon/4)] < 2\delta,$$

hence $p_1 < 2\delta$ and $p_n < (2\delta)^n$. Therefore, upon applying the standard
separability procedure,

$$PA_n(I) \le PB_n(I) < (2\delta)^n.$$

Thus, taking $n = 1$,

$$P[\sup_{t', t'' \in I} d(X_{t'}, X_{t''}) > \epsilon] < 2\delta,$$

and $a, \epsilon, \delta > 0$ being arbitrary, it follows that $X_T$ is a.s. continuous. On
the other hand, taking $\delta < \frac{1}{2}$,

$$\sum_{n=1}^\infty PA_n(I) \le \sum_{n=1}^\infty PB_n(I) < \sum_{n=1}^\infty (2\delta)^n < \infty$$

so that, by the Borel-Cantelli lemma, on $I$ almost all sample functions of
$X_T$ have only a finite number of oscillations greater than $\epsilon$ and of
consecutive oscillations greater than $\epsilon/2$. Since every finite time interval
$[0, c]$ is covered by a finite number of intervals of positive length $h$, it
follows that on $[0, c]$ almost all sample functions of $X_T$ are bounded and
have left and right limits. The equivalence assertion results then from
the discussion which precedes the theorem.

2° Suppose (ii) holds. In fact, in lieu of its second part, we suppose
only the boundedness conclusion of b. The preceding argument applies
but with restrictions to suitable compacts. First, given $\epsilon, \delta > 0$ and
$[0, c]$, take $c' > c$ and select a compact $K$ so that the set of sample
functions which do not stay in $K$ on $[0, c']$ corresponds to an event of pr. less
than $\delta$ (thus $P[X_t \in K] > 1 - \delta$ for all $t \le c'$); then take a compact $K'$
containing all $V_x(\epsilon/2)$, $x \in K$ (decreasing $\epsilon$ if necessary). There exists
an $h = h(\epsilon, \delta, K') > 0$ such that $P_{h'}(x, V_x^c(\epsilon/4)) < \delta$ for all $h' \le h$ and
all $x \in K'$; decrease $h$ if necessary so that $h < c' - c$ and a finite number
of intervals $I \subset [0, c']$ covers $[0, c]$.

Let $p_n$ be the supremum of $P_{\tau_0 x}[\tau_0 < \cdots < \tau_n < \infty, X_{\tau_0} \in K, \dots, X_{\tau_{n-1}} \in K]$
over all finite sets $T' \subset I$ and over all $x \in K$. Upon proceeding
as in (i), the relation $p_n \le p_1^n$ is still valid. Similarly, for $x \in K$,

$$P_{\tau_0 x}[\tau_0 < \tau_1 < \infty, X_{\tau_0} \in K, X_{\tau_1} \in K', X_b \in V_x(\epsilon/4)] < \delta$$

so that

$$P_{\tau_0 x}[\tau_0 < \tau_1 < \infty, X_{\tau_0} \in K] < \delta + P_{\tau_0 x}[X_{\tau_1} \in K'^c] + P_{\tau_0 x}[X_b \in V_x^c(\epsilon/4)] < 3\delta,$$

hence $p_1 < 3\delta$ and $p_n < (3\delta)^n$.

Upon proceeding as in (i) and using
$P[\tau_0 < \tau_1 < \infty] \le P[\tau_0 < \tau_1 < \infty, X_{\tau_0} \in K] + P[X_a \in K^c]$, it follows that

$$P[\sup_{t', t'' \in I} d(X_{t'}, X_{t''}) > \epsilon] \le p_1 P[X_a \in K] + P[X_a \in K^c] < 4\delta,$$

and $X_T$ is a.s. continuous on $[0, c]$; since $c$ is arbitrarily large, $X_T$ is a.s.
continuous. Similarly, on $[0, c']$, almost all of those sample functions
which stay in $K$ have left and right limits; since the others correspond to
an event of pr. less than $\delta$ and $\delta$ is arbitrarily small, the restriction to $K$
can be removed. The equivalence assertion follows. Quasileft continuity,
page 383, completes this subsection.

46.2. Extended infinitesimal operator. Let $X_T = (X_t, t \ge 0)$ be
stationary Markovian with Borel tr.pr. $P_t(x, S)$ and corresponding Borel
semi-group $(T_t, t \ge 0)$ on the space $G$ of bounded measurable functions
on a measurable state space with metric $d$. Let $D$ on $G_d$ and $D'$ on
$G'_d$ be the strong and the weak infinitesimal operators of the semi-group
(we also say "of $X_T$"). We intend to solve in $D'g$ the integral relation

(46.1B)    $$E_x g(X_\tau) - g(x) = E_x \int_0^\tau D'g(X_t)\,dt$$

where $\tau$ is a Markov time of $X_T$ with $E_x \tau < \infty$, and $D'$ can be replaced
by $D$.

If $\tau$ is an ordinary time $h$, then, dividing both sides by $h$ and letting
$h \to 0$, we fall back upon the definition of $D'$. Yet, this is a "ready-to-wear"
approach, in the sense that sample properties of $X_T$ are not taken
into account. On the other hand, if we use a random time $\tau$ of $X_T$
defined in terms of its sample properties, the approach is fitted to the
process; it is "made to measure." The analogy with Riemann versus
Lebesgue integration is visible but not adequate. For, $\tau$ must be a
Markov time of $X_T$ and the approach becomes restricted to strong
Markov processes. Furthermore, loosely speaking, the tailored $\tau$ would
be the time $X_T$ spends in smaller and smaller neighborhoods of $x$. To
be precise, we take for $\tau$ the time $\tau_U$ that $X_T$ takes to hit the complement
of an open neighborhood $U$ of $x$ and let its diameter $|U| \to 0$.

According to 43.4c, $\tau_U$ is a time of $X_T$ when $X_T$ is sample rightcontinuous.
Thus from now on, $X_T$ is strongly Markovian and sample right
continuous.

The integral relation also requires that $E_x \tau_U$ be finite. The simple
conditions below will suffice.

Let $U$, with or without subscripts, denote open sets and, given $U_0$,
set $m(y) = E_y \tau_{U_0}$.

a. If there exist $s > 0$, $\delta > 0$ such that $P_x[\tau_{U_0} \le s] > \delta$ for all $x \in U_0$,
then $m(x) = E_x \tau_{U_0} < s/\delta < \infty$ for all $x \in U_0$. If $x \in U \subset U_0$ and
$m(x) < \infty$, then $E_x \tau_U < \infty$ and $m(x) = E_x \tau_U + E_x m(X_{\tau_U})$.

Proof. Let $A_t = [\tau_{U_0} > t]$. Since $A_{t+s} = A_t (A_s)_t$, where $(A_s)_t$ is the
translate by $t$ of $A_s$ and $X_t(\omega) \in U_0$ for $\omega \in A_t$, the first hypothesis
yields

$$P_x A_{t+s} = E_x(I_{A_t} P_{X_t} A_s) \le (1 - \delta) P_x A_t.$$

It follows, by induction, that $P_x A_{ns} < (1 - \delta)^n$ for $n \ge 1$ and the first
assertion is proved by

$$E_x \tau_{U_0} = \sum_{n=0}^{\infty} \int_{ns}^{(n+1)s} P_x[\tau_{U_0} > t]\,dt \le s \sum_{n=0}^{\infty} P_x A_{ns} < s/\delta.$$

Since $U \subset U_0$, it follows that $\tau_U \le \tau_{U_0}$ and the translate by $\tau_U$ of $\tau_{U_0}$ is
$\tau_{U_0} - \tau_U$. Therefore, by the second hypothesis, $E_x \tau_U \le E_x \tau_{U_0} = m(x) < \infty$ and

$$E_x(\tau_{U_0} - \tau_U) = E_x(E_{X_{\tau_U}} \tau_{U_0}) = E_x m(X_{\tau_U}),$$

hence

$$m(x) = E_x \tau_U + E_x(\tau_{U_0} - \tau_U) = E_x \tau_U + E_x m(X_{\tau_U}).$$

The second assertion is proved, and the proof is terminated.
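
The identity $m(x) = E_x \tau_U + E_x m(X_{\tau_U})$ of a can be checked by hand in
a discrete-time analogue. The sketch below is an illustration only and is
not part of the text: it assumes a simple symmetric random walk on the
integers (not the continuous-time process considered here), for which the
mean exit time from the open interval (lo, hi) started at x is
(x - lo)(hi - x) and the walk leaves through hi with probability
(x - lo)/(hi - lo). All function names are hypothetical.

```python
from fractions import Fraction

def mean_exit_time(x, lo, hi):
    """E_x[tau] for the simple symmetric walk leaving the open interval (lo, hi)."""
    return Fraction((x - lo) * (hi - x))

def exit_through_upper(x, lo, hi):
    """P_x[the walk leaves (lo, hi) through hi]."""
    return Fraction(x - lo, hi - lo)

def both_sides(x, lo, hi, lo0, hi0):
    """Return (m(x), E_x tau_U + E_x m(X_tau_U)) for U = (lo, hi) inside U0 = (lo0, hi0)."""
    m = lambda y: mean_exit_time(y, lo0, hi0)
    p_up = exit_through_upper(x, lo, hi)
    expected_m_at_exit = (1 - p_up) * m(lo) + p_up * m(hi)
    return m(x), mean_exit_time(x, lo, hi) + expected_m_at_exit

lhs, rhs = both_sides(x=5, lo=3, hi=8, lo0=0, hi0=12)
print(lhs, rhs)  # 35 35
```

Exact rational arithmetic is used so that the two sides agree identically
rather than up to rounding.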

Let $U$ denote open neighborhoods of $x$ and set

$$\tilde{D}g(x) = \lim_{|U| \to 0} \frac{E_x g(X_{\tau_U}) - g(x)}{E_x \tau_U} \quad \text{if } E_x \tau_U < \infty \text{ for some } U,$$

$$\tilde{D}g(x) = 0 \quad \text{if } E_x \tau_U = \infty \text{ for all } U;$$

it suffices to use the first form with the convention that the ratio therein
is 0 when $E_x \tau_U = \infty$.

Denote by $\tilde{G}_d$ the set of those functions $g \in G$ for which $\tilde{D}g$ exists and
belongs to $G$. We say that $\tilde{D}$ on $\tilde{G}_d$ is the extended infinitesimal operator
of $X_T$ or of its semi-group. Note that by a, setting $m(z) = E_z \tau_{U_0}$,
$U \subset U_0$, and $P_{\tau_U}(x, S) = P_x[X_{\tau_U} \in S]$, hence $T_{\tau_U} g(x) = \int P_{\tau_U}(x, dy) g(y)$,
we have

$$\frac{E_x g(X_{\tau_U}) - g(x)}{E_x \tau_U} = \frac{T_{\tau_U} g(x) - g(x)}{E_x \tau_U} = -\frac{\int P_{\tau_U}(x, dy)\{g(y) - g(x)\}}{\int P_{\tau_U}(x, dy)\{m(y) - m(x)\}}$$

with the same convention as above.

According to the integral relation, when some $E_x \tau_U < \infty$ then

(C)    $$\tilde{D}g(x) = \lim_{|U| \to 0} \frac{1}{E_x \tau_U} E_x \int_0^{\tau_U} D'g(X_t)\,dt, \quad g \in G'_d,$$

in the sense that if either of the sides exists so does the other and both are
equal. Thus, if the right side is $D'g(x)$ then $\tilde{D}g(x) = D'g(x)$ and our
problem may be enlarged into a search for conditions under which the
last equality holds, whether or not $E_x \tau_U$ are finite.

The equality is trivially true when $x$ is an absorbing state, that is,
almost all sample functions which start at $x$ stay there forever. For then,
$E_x \tau_U = \infty$ for all $U$ and $\tilde{D}g(x) = 0$ by definition, while $P_x[X_t = x] = 1$
for all $t$ implies that $T_t g(x) = g(x)$ for all $t$ and $D'g(x) = 0$ by definition.







In fact, we may expect the equality to hold when almost all sample
functions stay at $x$ for some positive time. Similarly when $D'g$ is continuous
at $x$ provided $E_x \tau_U < \infty$ for some $U$. For then, given $\epsilon > 0$,
$|D'g(y) - D'g(x)| < \epsilon$ for all $y \in U$ with $|U|$ sufficiently small, hence
$|D'g(X_t) - D'g(x)| < \epsilon$ for $t < \tau_U$ and

$$\Big| E_x \int_0^{\tau_U} (D'g(X_t) - D'g(x))\,dt \Big| < \epsilon E_x \tau_U.$$

However, we are concerned with the process and not with some of its
r.f.'s such as those which start at some specified state. Then the required
continuity of $D'g$ leads to considering "stable processes" and the
required time interval of constancy leads to considering "jump processes."
We intend to show that in either case, the term "extended infinitesimal
operator" is justified, in the sense that $\tilde{D} \supset D'$, that is, the domain of $\tilde{D}$
under consideration contains that of $D'$; since always $D' \supset D$, we shall
have $\tilde{D} \supset D' \supset D$.

We say that a stationary Markovian process $X_T$ is a jump process if,
for every $\omega \in \Omega$ and $t \ge 0$, there exists an $h_0 > 0$ such that
$X_t(\omega) = X_{t+h}(\omega)$ for $0 \le h \le h_0$; equivalently, if $X_T$ is sample rightcontinuous
in the discrete topology (in the state space). Since our concern here is
with stationary Markov processes, we reserve the term "jump process"
for such processes. It follows at once from the definition that for a
jump process $X_T$, as $h \to 0$,

$$P_h(x, \{x\}) \to 1, \quad P_h(x, S) \to I(x, S),$$

$$P_{t+h}(x, S) \to P_t(x, S), \quad T_{t+h} g(x) \to T_t g(x),$$

and $X_T$ and its tr.pr. are Borelian. Furthermore, according to 46.1A(2°),
$X_T$ is strongly Markovian.

Let $\tau_1$ be the first jump time of $X_T$: $\tau_1(\omega)$ is the time $X_t(\omega)$ first hits the
complement of the singleton $\{X_0(\omega)\}$, the smallest open neighborhood
of the state $X_0(\omega)$ in the discrete topology. The time $\tau_1$ of the stationary
strongly Markovian $X_T$ is its stationary Markov time and $A_{t+s} = A_t (A_s)_t$
where $A_t = [\tau_1 > t]$. Since $X_t(\omega) = X_0(\omega)$ for $\omega \in A_t$, it follows
that

$$P_x A_{t+s} = E_x(I_{A_t} P_{X_t} A_s) = P_x A_t P_x A_s,$$

and the nonnegative bounded nonincreasing function $p_x(t) = P_x A_t$
obeys the classical functional equation $p_x(t + s) = p_x(t) p_x(s)$, $t, s \ge 0$.
Therefore, $P_x[\tau_1 > t] = e^{-q(x)t}$, $q(x) \ge 0$, and $q(x) < \infty$ since
$q(x) = \infty \iff P_x[\tau_1 > 0] = 0$ ($x$ is instantaneous), contrary to the definition of
a jump process. Thus $E_x \tau_1 = 1/q(x) > 0$ and $E_x \tau_1 = \infty \iff q(x) = 0
\iff x$ is absorbing. To summarize

b. If $X_T$ is a jump process and $\tau_1$ is its first jump time, then

$$P_x[\tau_1 > t] = e^{-q(x)t}, \quad 0 \le q(x) < \infty, \quad x \in \mathcal{X}, \quad t \ge 0,$$

and $E_x \tau_1 = 1/q(x) > 0$ is infinite or finite according as $x$ is absorbing or
is not.
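
The two facts summarized in b, the multiplicative functional equation
for the survival function and the mean $1/q(x)$, can be illustrated
numerically. The following is a minimal sketch under the assumption of a
single nonabsorbing state with a fixed rate q > 0; the function names,
step size and integration horizon are choices made for this illustration.

```python
import math

def survival(q, t):
    """P_x[tau_1 > t] = exp(-q t) for a state with jump rate q."""
    return math.exp(-q * t)

def mean_holding_time(q, step=1e-3, horizon=60.0):
    """Approximate E tau_1 = integral over t >= 0 of P[tau_1 > t] by the midpoint rule."""
    n = int(horizon / step)
    return sum(survival(q, (k + 0.5) * step) for k in range(n)) * step

q = 2.0
# Functional equation p(t + s) = p(t) p(s):
print(abs(survival(q, 0.7) - survival(q, 0.3) * survival(q, 0.4)) < 1e-12)
# Mean holding time, close to 1/q:
print(abs(mean_holding_time(q) - 1.0 / q) < 1e-5)
```

The mean is computed from the survival function, which is the same
representation of the expectation as the one used in the proof of 46.2a.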

In the discrete topology, with metric $d$ defined by $d(x, y) = 0$ or 1
according as $y = x$ or $y \ne x$, every set $U \ni x$ is an open neighborhood
of $x$ and its diameter $|U| = 0$ or 1 according as $U$ reduces to the singleton
$\{x\}$ or has other points besides $x$. Therefore, we have to set $\tau_U = \tau_1$
in the definition of $\tilde{D}$ and, using b, it becomes

$$\tilde{D}g(x) = q(x)\{E_x g(X_{\tau_1}) - g(x)\} = q(x) \int P_{\tau_1}(x, dy)\{g(y) - g(x)\}.$$

On the other hand, since $X_t = x$ for $t < \tau_1$,

$$q(x) E_x \int_0^{\tau_1} D'g(X_t)\,dt = D'g(x), \quad g \in G'_d.$$

Thus

$$\tilde{D}g(x) = D'g(x), \quad g \in G'_d$$

(whether $x$ is absorbing or not), and

A. Jump processes extension theorem. If $X_T$ is a jump process
then $\tilde{D} \supset D' \supset D$ and, for every $x \in \mathcal{X}$,

$$\tilde{D}g(x) = q(x) \int P_{\tau_1}(x, dy)\{g(y) - g(x)\}, \quad g \in \tilde{G}_d \supset G'_d \supset G_d.$$

Note that for jump processes $\tilde{D}$ is an integral operator, and if the
function $q(\cdot)$ is bounded then $\tilde{G}_d = G$.
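
For a finite state space the integral in A reduces to a finite sum, which
makes the formula easy to evaluate. The sketch below is illustrative only:
the three states, the rates q and the jump law (standing for
$P_{\tau_1}(x, \cdot)$) are made-up data, and the function name is hypothetical.

```python
def extended_operator(q, jump_law, g, x):
    """q(x) * sum over y of P_{tau_1}(x, {y}) * (g(y) - g(x)), finite state space."""
    return q[x] * sum(prob * (g[y] - g[x]) for y, prob in jump_law[x].items())

# Hypothetical three-state jump process; state "c" is absorbing (q = 0).
q = {"a": 2.0, "b": 0.5, "c": 0.0}
jump_law = {"a": {"b": 0.25, "c": 0.75}, "b": {"a": 1.0}, "c": {}}
g = {"a": 1.0, "b": 3.0, "c": -1.0}

values = {x: extended_operator(q, jump_law, g, x) for x in q}
print(values)  # {'a': -2.0, 'b': -1.0, 'c': 0.0}
```

At the absorbing state the value is 0, in agreement with the convention
adopted in the definition of the extended operator.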

Stable stationary Markov processes have been defined and investigated
in 45.3: Their semi-groups $(T_t, t \ge 0)$ leave the space $C$ invariant, that
is, transform bounded continuous functions $g$ into bounded continuous
functions $T_t g$, for every $t$. It suffices to consider these semi-groups on
the invariant subspace $C$. Thus, the domains of $D$, $D'$, $\tilde{D}$ are to be replaced
by their intersections $C_d$, $C'_d$, $\tilde{C}_d$ with $C$. In defining $\tilde{D}$, we restricted
ourselves to stationary Markov processes with Borel tr.pr. and
sample rightcontinuity, for short to right continuous processes. We
also required strong Markov property, but (as in the case of jump
processes) this requirement is superfluous in the case of stable processes;
according to 41.1A(1°), rightcontinuous stable processes are strongly
Markovian. Since the domains of the corresponding semi-groups are
restricted to $C$, hence all $D'g$ are continuous, the discussion which follows
the definition of $\tilde{D}$ shows that then $\tilde{D} \supset D' \supset D$, provided $E_x \tau_U < \infty$
for nonabsorbing states $x \in U$. In fact

c. Let $X_T$ be stable. If $P_x[X_s \in U] = \delta > 0$ for some $s > 0$ and open $U$,
then $P_y[X_s \in U] > \delta/2$ for some open $U_0 \ni x$ and all $y \in U_0$.

Let $X_T$ be stable rightcontinuous. If $x$ is nonabsorbing, then
$m(y) = E_y \tau_{U_0} < \infty$ for some open $U_0 \ni x$ and all $y \in U_0$.

Proof. Let $V_n = [z: d(z, U^c) \ge 1/n]$ so that $V_n \uparrow U$ and
$P_x[X_s \in V_n] \to P_x[X_s \in U] = \delta$. Thus, for $\delta > 0$ there is an $n$ such that
$P_x[X_s \in V_n] > 3\delta/4$. For this $n$, define the bounded continuous function $g$ by
$g(y) = n\,d(y, U^c)$ or 1 according as $y \notin V_n$ or $y \in V_n$. If $X_T$ is stable,
then $g' = T_s g$ is also bounded and continuous and there exists $U_0 \ni x$
such that $g'(y) > g'(x) - \delta/4$ for $y \in U_0$. Then, from

$$g'(x) = \int P_s(x, dy) g(y)$$

and the definition of $g$, it follows that for $y \in U_0$,

$$P_y[X_s \in U] \ge g'(y) > g'(x) - \delta/4 \ge P_x[X_s \in V_n] - \delta/4 > \delta/2.$$

The first assertion is proved.

Let $X_T$ be stable right continuous. If $x$ is nonabsorbing, then there
exists an open $U$ with $d(x, U) > 0$ and $P_x[X_s \in U] = \delta > 0$ for some
$s > 0$ and, by the first assertion, there exists an open $U_0 \ni x$ that we
can take disjoint from $U$, such that $P_y[X_s \in U] > \delta/2$ for all $y \in U_0$.
Therefore, for all $y \in U_0$,

$$P_y[\tau_{U_0} > s] \le P_y[X_s \in U_0] < 1 - \delta/2$$

and, by a, $m(y) = E_y \tau_{U_0} < \infty$. The second assertion is proved, and the
proof is terminated.

Let $m(z) = E_z \tau_{U_0}$, $U \subset U_0$, and recall that $P_{\tau_U}(x, S) = P_x[X_{\tau_U} \in S]$.

B. Rightcontinuous stable processes extension theorem. If
$X_T$ is stable right continuous, then $\tilde{D} \supset D' \supset D$ and

$$\tilde{D}g(x) = -\lim_{|U| \to 0} \frac{\int P_{\tau_U}(x, dy)\{g(y) - g(x)\}}{\int P_{\tau_U}(x, dy)\{m(y) - m(x)\}}, \quad g \in \tilde{C}_d,$$

or $\tilde{D}g(x) = 0$ according as $x$ is nonabsorbing or absorbing; if the state space
is compact, then $\tilde{D} = D' = D$.

Proof. The inclusion and limit assertions result from the discussion
which follows the definition of $\tilde{D}$ and the restriction of the semi-group
on $C$.

To prove the last relation, note that, by 40.3B and d, $D' = D$ and
$C'_c = C_c = C$. Since $\tilde{D} \supset D'$, it suffices to prove that $\tilde{D} \subset D'$, that is,
$\tilde{C}_d \subset C'_d$. Let $g \in \tilde{C}_d$ and set

$$f' = (I - \tilde{D})g, \quad g' = R_1 f' = \int_0^\infty e^{-t} T_t f'\,dt.$$

Since $f' \in C = C_c$, it follows that $g' \in C'_d$ and $f' = (I - D')g'$. Thus,
$f' = (I - \tilde{D})g'$ because of $\tilde{D} \supset D'$. Therefore,

$$(I - \tilde{D})f = 0, \quad f = g - g'.$$

$\mathcal{X}$ being compact, $f$ attains its supremum on $\mathcal{X}$ for some $x_0$ and the defining
relation for $\tilde{D}$ implies that $\tilde{D}f(x_0) \le 0$, hence $f \le 0$; similarly,
$-f \le 0$. Thus, $f = 0$, that is, $g = g' \in C'_d$. The proof is terminated.

We say that $X_T$ is a continuous stable process if it is stable rightcontinuous
and also sample leftcontinuous. The sample functions of a continuous
stable process being continuous, the time $\tau_U$ is the time $X_T$ first
reaches the closed set $U^c$. Therefore, $X_{\tau_U}$ belongs to the boundary $U'$
of $U$ and the integrals in B can be taken over $U'$ only:

$$\tilde{D}g(x) = -\lim_{|U| \to 0} \frac{\int_{U'} P_{\tau_U}(x, dy)\{g(y) - g(x)\}}{\int_{U'} P_{\tau_U}(x, dy)\{m(y) - m(x)\}}.$$

This expression is similar to the one which gives the ordinary Laplacian
operator in terms of averages on spherical surfaces, except that there
the averaging is with respect to a uniformly distributed measure. This
leads to considering the foregoing operator as a generalized elliptic
differential operator of second order or, in physical terms, a (general)
diffusion operator, to be denoted by $\mathfrak{D}$. According to the convention
made, we have, for every $x \in \mathcal{X}$ and $g \in \tilde{C}_d$,

$$\mathfrak{D}g(x) = \lim_{|U| \to 0} \frac{1}{E_x \tau_U} \int_{U'} P_{\tau_U}(x, dy)\{g(y) - g(x)\}.$$

It follows that if $g = g'$ in a neighborhood of $x$ then $\mathfrak{D}g(x) = \mathfrak{D}g'(x)$, and
if $g$ attains a relative minimum at $x$, then $\mathfrak{D}g(x) \ge 0$. The foregoing
terminology is further justified because, on sufficiently smooth functions,
$\mathfrak{D}$ can be written as an ordinary elliptic differential operator, as follows.
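
The analogy with the Laplacian can be made concrete in the simplest
case. The sketch below is an illustration under an assumption not made in
the text: the process is standard one-dimensional Brownian motion, for
which, starting at x, the exit from U = (x - r, x + r) occurs at either
endpoint with probability 1/2 and the mean exit time is r^2. The defining
ratio then approximates g''(x)/2. The function name is hypothetical.

```python
import math

def diffusion_operator_bm(g, x, r=1e-3):
    """Ratio (E_x g(X_tau) - g(x)) / E_x tau for U = (x - r, x + r), standard Brownian motion."""
    mean_at_exit = 0.5 * (g(x - r) + g(x + r))   # exit law: mass 1/2 at each endpoint
    return (mean_at_exit - g(x)) / (r * r)       # mean exit time is r^2

x = 0.7
approx = diffusion_operator_bm(math.cos, x)
exact = -0.5 * math.cos(x)                       # half the second derivative of cos
print(abs(approx - exact) < 1e-6)
```

Shrinking r plays the role of letting the diameter of U tend to 0.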

C. Diffusion operators theorem. Let $\mathfrak{D}$ be a diffusion operator and
let $g_k$, $g_j g_k$, $j, k = 1, \dots, n$, belong to the domain $\tilde{C}_d$ of $\mathfrak{D}$, in some neighborhood
of a state $x$.

If $f(y_1, \dots, y_n)$ is twice continuously differentiable in a neighborhood of
$(g_1(x), \dots, g_n(x))$, then $\mathfrak{D}f(g_1(x), \dots, g_n(x))$ exists and equals

$$\sum_{k=1}^{n} a_k \frac{\partial f}{\partial g_k} + \frac{1}{2} \sum_{j,k=1}^{n} b_{jk} \frac{\partial^2 f}{\partial g_j \partial g_k}$$

where the derivatives are taken at $(g_1(x), \dots, g_n(x))$,

$$a_k = \mathfrak{D}(g_k - g_k(x))(x), \quad b_{jk} = \mathfrak{D}(g_j - g_j(x))(g_k - g_k(x))(x)$$

and the $b_{jk}$ form a nonnegative type matrix.

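Proof. We have

$$f(g_1, \dots, g_n) - f(g_1(x), \dots, g_n(x)) = \sum_k \frac{\partial f}{\partial g_k} h_k + \frac{1}{2} \sum_{j,k} \frac{\partial^2 f}{\partial g_j \partial g_k} (1 + \delta_{jk}) h_j h_k$$

where the derivatives are taken at $(g_1(x), \dots, g_n(x))$, $h_k = g_k - g_k(x)$
and $\delta_{jk}(x') = \delta_{jk}(g_1(x'), \dots, g_n(x')) \to 0$ as the $h_k \to 0$, hence as
$x' \to x$. Thus, the ratio in the defining expression of $\mathfrak{D}$ can be written
as a sum of three terms: As $|U| \to 0$, the first term

$$\sum_k \frac{\partial f}{\partial g_k} \frac{1}{E_x \tau_U} \int_{U'} P_{\tau_U}(x, dy) h_k(y) \to \sum_k a_k \frac{\partial f}{\partial g_k},$$

the second term

$$\frac{1}{2} \sum_{j,k} \frac{\partial^2 f}{\partial g_j \partial g_k} \frac{1}{E_x \tau_U} \int_{U'} P_{\tau_U}(x, dy) h_j(y) h_k(y) \to \frac{1}{2} \sum_{j,k} b_{jk} \frac{\partial^2 f}{\partial g_j \partial g_k},$$

and the squares of the summands of the third term

$$\Big| \frac{1}{E_x \tau_U} \int_{U'} P_{\tau_U}(x, dy) h_j(y) h_k(y) \delta_{jk}(y) \Big|^2 \le \max_{y \in U'} |\delta_{jk}(y)|^2 \cdot \frac{1}{E_x \tau_U} \int_{U'} P_{\tau_U}(x, dy) h_j^2(y) \cdot \frac{1}{E_x \tau_U} \int_{U'} P_{\tau_U}(x, dy) h_k^2(y)$$

converge to 0, since the first factor converges to zero and the two others
are bounded. Finally

$$\sum_{j,k} b_{jk} \lambda_j \lambda_k = \mathfrak{D}\Big(\sum_k \lambda_k h_k\Big)^2(x) \ge 0,$$

since the function on which $\mathfrak{D}$ operates attains a relative minimum at $x$.
The proof is terminated.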
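
The formula of the theorem can be tested numerically in one dimension.
The sketch below is an illustration under an assumption not made in the
text: the diffusion operator is that of standard Brownian motion on the
line, approximated by the defining ratio over U = (x - r, x + r) (exit at
either endpoint with probability 1/2, mean exit time r^2). It takes n = 1,
g = sin and f = exp, and compares the operator applied to f(g) directly
with a f' + b f''/2. All names are hypothetical.

```python
import math

def D(func, x, r=1e-3):
    """Brownian stand-in for the diffusion operator: (exit average - value) / mean exit time."""
    return (0.5 * (func(x - r) + func(x + r)) - func(x)) / (r * r)

def via_theorem(f1, f2, g, x):
    """a * f'(g(x)) + (1/2) * b * f''(g(x)), with a and b computed from D as in the theorem."""
    h = lambda y: g(y) - g(x)
    a = D(h, x)                         # a = D(g - g(x))(x)
    b = D(lambda y: h(y) ** 2, x)       # b = D((g - g(x))^2)(x)
    return a * f1(g(x)) + 0.5 * b * f2(g(x))

x0 = 0.4
direct = D(lambda y: math.exp(math.sin(y)), x0)
formula = via_theorem(math.exp, math.exp, math.sin, x0)   # exp is its own derivative
print(abs(direct - formula) < 1e-5)
```

Here b is approximately cos(x0)^2, which is nonnegative, as the last
assertion of the theorem requires.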

To conclude this general discussion of the extended infinitesimal
operator, let us mention that whenever it is a true extension of the weak
infinitesimal operator, the last one is obtained by supplying further
information. This information usually takes the form of boundary
conditions. Thus, loosely speaking, the extended infinitesimal operator
describes the behavior of the process before it reaches the boundary of
the domain on which it is considered.


46.3. One-dimensional diffusion operator. In the one-dimensional
case, the diffusion operator takes a specific differential form, and we
proceed to establish this fundamental result of Feller, following Dynkin.

Let $X_T$ be continuous stable with a one-dimensional interval state
space $\mathcal{X} = [\alpha, \beta]$ and the $\sigma$-field $\mathcal{S}$ of topological Borel sets in it. We
take $U = (x_1, x_2) \ni x$; its boundary consists of the two endpoints to
which correspond the pr.'s $p_{x_1} = P_x[X_{\tau(x_1, x_2)} = x_1]$ and
$p_{x_2} = P_x[X_{\tau(x_1, x_2)} = x_2]$. Let $g \in \tilde{C}_d$. We know that

I. If $x$ is absorbing then $\mathfrak{D}g(x) = 0$.

If $x_0$ is not absorbing, we can select $(x'_1, x'_2) \ni x_0$ so that, for all
$x \in (x'_1, x'_2)$, $m(x) = E_x \tau(x'_1, x'_2)$ is finite and

($\mathfrak{D}_1$)    $$\mathfrak{D}g(x) = -\lim_{x_1 \to x - 0,\ x_2 \to x + 0} \frac{p_{x_1}(g(x_1) - g(x)) + p_{x_2}(g(x_2) - g(x))}{p_{x_1}(m(x_1) - m(x)) + p_{x_2}(m(x_2) - m(x))}.$$

Formally, upon applying L'Hospital's rule, we obtain




_PM_ 

p{x)m n {x) + 2p f {x)m f {x) 




2p f {x) 


-4- 




so that $\mathfrak{D}$ appears as a differential operator of second order (which may
degenerate). In fact, we shall establish that $\mathfrak{D}$ is a generalized differential
operator. For this purpose, we classify the states as follows:

$x$ is a right passage point or a left passage point if $P_x[X_t > x] > 0$ or
$P_x[X_t < x] > 0$ for some $t$, and $x$ is a passage point if it is a right and a
left passage point. Note that if $x$ is neither a right nor a left passage
point, then it is absorbing.

II. At one-sided passage points, $\mathfrak{D}$ is a generalized differential operator
of first order.

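If $x$ is a right but not a left passage point then $p_{x_1} = 0$ and

$$\mathfrak{D}g(x) = -\lim_{x_2 \to x + 0} \frac{g(x_2) - g(x)}{m(x_2) - m(x)} = -D_m^+ g(x).$$

If $x$ is a left but not a right passage point then $p_{x_2} = 0$ and

$$\mathfrak{D}g(x) = -\lim_{x_1 \to x - 0} \frac{g(x_1) - g(x)}{m(x_1) - m(x)} = -D_m^- g(x).$$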


To find the differential form in the case of (two-sided) passage points,
we require more information about the properties of such points. The
arguments to be used are similar to those which preceded the one-dimensional
case and, therefore, will be shortened.

Let $\tau_y(\omega) = \inf[t: X_t(\omega) = y]$ if this set is not empty and $\tau_y(\omega) = \infty$
otherwise. Thus, $\tau_y(\omega) = \tau_{[\alpha, y)}$ or $\tau_{(y, \beta]}$ according as $X_0(\omega) < y$ or
$X_0(\omega) > y$; note that the subscript sets are open in $\mathcal{X} = [\alpha, \beta]$. Denote
by $p(x, y) = P_x[\tau_y < \infty]$ the pr. starting at $x$ to reach $y$ in a finite time,
and denote by $p(x, y, z) = P_x[\tau_y < \tau_z]$ the pr. starting at $x$ to reach $y$
before reaching $z$. Note that $p(x, y, z) + p(x, z, y) \le 1$ and that $x$ is
not a right passage point if and only if $p(x, y) = 0$ for all $y > x$.

a. (i) If $a < x < y < b$ or $a > x > y > b$, then

$$p(y, a) = p(y, x) p(x, a), \quad p(y, a, b) = p(y, x, b) p(x, a, b),$$

$$p(x, a, b) = 0 \implies p(x, a) = 0.$$

(ii) If $x$ is a right passage point, then there exists $(x', x'') \ni x$ such that
$p(x', x'') > 0$, and if all $x \in [a, b]$ are right passage points then $p(a, b) > 0$.

Proof. 1° Let, say, $a < x < y < b$. Since

$$[X_0 = y, \tau_a < \infty] = [X_0 = y, \tau_x < \infty] \cap [\tau_a < \infty]_{\tau_x}$$

and $X_{\tau_x} = x$, it follows that

$$P_y[\tau_a < \infty] = E_y\{I_{[\tau_x < \infty]} P_{X_{\tau_x}}[\tau_a < \infty]\} = P_x[\tau_a < \infty] P_y[\tau_x < \infty].$$

This proves the first equality in (i), and similarly for the second equality.

Let $a < x' < x < x'' < b$. Let $S'$ be the set of states $z' < x'$ and let
$S''$ be the set of states $z'' > x''$. We say that $X_t(\omega)$ crosses exactly $n$
times from $S''$ into $S'$ in time $t$ if there exist $n$ and only $n$ pairs $(t'_k, t''_k)$
with $0 \le t'_1 < t''_1 < t'_2 < \cdots < t''_n \le t$ such that $X_{t''_k} \in S'$ and $X_{t'_k} \in S''$.
Let $\omega \in A \iff X_0(\omega) = x$ and $\tau_a(\omega) < \infty$, and let $\omega \in A_n \iff \omega \in A$
and $X_t(\omega)$ crosses exactly $n$ times from $S''$ into $S'$ in time $\tau_a(\omega)$, so that
$A = \bigcup_{n=0}^{\infty} A_n$. Let $\tau_n(\omega) = \inf[t: X_t(\omega) = x$ and $X_T(\omega)$ crosses exactly
$n$ times from $S''$ into $S'$ in time $t]$ if this set is not empty and $\tau_n(\omega) = \infty$
otherwise, so that $A_n \subset (A_0)_{\tau_n}$. If $p(x, a, b) = P_x A_0 = 0$, then, from
$X_{\tau_n} = x$ it follows that $P_x (A_0)_{\tau_n} = 0$, hence $P_x A_n = 0$ and $P_x A = 0$.
This proves the last assertion in (i).

2° If $x$ is a right passage point, that is, $P_x[X_t > x] > 0$ for some $t$,
then there exists an $x'' > x$ such that $P_x[X_t > x''] > 0$, hence there
exists a neighborhood $(x - \epsilon, x + \epsilon)$ such that $P_y[X_t > x''] > 0$ for all
$y \in (x - \epsilon, x + \epsilon)$. Therefore, $p(x', x'') > 0$ for any $x' \in (x - \epsilon, x)$.
This proves the first assertion in (ii). If all $x \in [a, b]$ are right passage
points then, by what precedes, there exist intervals $(x', x'') \ni x$ such that
$p(x', x'') > 0$. These open intervals cover $[a, b]$, hence a finite number of
them $(x'_k, x''_k)$, $k \le n$, covers $[a, b]$. It follows from the first equality
in (i) that $p(a, b) > 0$ and the second assertion in (ii) is proved. The
proof is terminated.

b. Let $p(a, b) > 0$. If $a < b$ then all $x \in [a, b)$ are right passage points,
all $E_x \tau_{(a,b)}$ are finite, and $p(x, b, a) \to 1$ as $x \to b - 0$; if $a > b$ then
$p(x, b, a) \to 1$ as $x \to b + 0$.

Proof. Let $x \in [a, b)$. If $x$ is not a right passage point then, by a,
$p(a, b) = p(a, x) p(x, b) = 0$ contrary to the hypothesis. The first
assertion is proved.

Since for $X_0 = y$ with $y \in [a, x]$

$$[\tau_b > t]_{\tau_x} \subset [\tau_b > t], \quad X_{\tau_x} = x, \quad \tau_{(a,b)} = \min(\tau_a, \tau_b) \le \tau_b,$$

it follows that

$$P_x[\tau_{(a,b)} > t] \le P_x[\tau_b > t] = E_a(P_{X_{\tau_x}}[\tau_b > t]) \le P_a[\tau_b > t]$$

where, letting $t \to \infty$,

$$P_a[\tau_b > t] \to P_a[\tau_b = \infty] = 1 - p(a, b) < 1.$$

Therefore, for some $t$,

$$P_x[\tau_{(a,b)} > t] \le P_a[\tau_b > t] = 1 - \delta < 1$$

and, by 41.2a, $E_x \tau_{(a,b)} < \infty$. This proves the second assertion.

Let $a < x < x_1 < \cdots < x_n \uparrow b$ and $X_0 = x$, hence $\tau_{x_1} < \cdots < \tau_{x_n} \uparrow \tau \le \tau_b$.
Set $A = [\tau_b < \tau_a]$ and $A_n = [\tau_{x_n} < \tau_a]$ so that $A_n \downarrow\ \supset A$. Assume
there is an $\omega \in \bigcap A_n - A$. Either $\tau(\omega) < \infty$ or $\tau(\omega) = \infty$. But, by
sample continuity, on $[\tau < \infty]$, $x_n = X_{\tau_{x_n}} \to X_\tau$, so that $X_\tau = b$, hence $\tau = \tau_b$.
Therefore $\tau(\omega) < \infty$ implies $\omega \in A$ which contradicts the above assumption.
Thus $\tau(\omega) = \infty$ so that $\tau_a(\omega) \ge \tau_b(\omega) \ge \tau(\omega) = \infty$, since $\omega \in \bigcap A_n$. It
follows that $\tau_{(a,b)}(\omega) = \infty$ while, by a, $p(a, b) > 0$ implies $E_x \tau_{(a,b)} < \infty$.
Therefore $P_x(\bigcap A_n - A) = 0$,

$$p(x, x_n, a) = P_x A_n \to P_x A = p(x, b, a) = p(x, x_n, a) p(x_n, b, a)$$

and $p(x, b, a) > 0$, since otherwise $p(x, b) = 0$ and $p(a, b) = p(a, x) p(x, b)
= 0$. It follows that $p(x_n, b, a) \to 1$, and similarly when $b < a$. This
proves the last assertion. The proof is terminated.

c. Let $p(a, b) p(b, a) > 0$ and let $x$ vary over $[a, b]$.

(i) The function $p(x) = p(x, b, a)$ is continuous and increasing with
$p(x) \to 0$ as $x \to a + 0$ and $p(x) \to 1$ as $x \to b - 0$, and

$$p(x, x_1, x_2) = (p(x_2) - p(x))/(p(x_2) - p(x_1)), \quad a \le x_1 < x < x_2 \le b.$$

(ii) The function $-m(x) = -E_x \tau_{(a,b)}$ is continuous and, with respect
to the function $p(x)$, it is convex and has an increasing left continuous left
derivative $s^-(x) = -D_p^- m(x)$ and an increasing right continuous right
derivative $s^+(x) = -D_p^+ m(x)$ which coincide on their continuity set $C(s)$.

Note that, if the hypothesis holds then all points of $(a, b)$ are passage
points and if all points of $[a, b]$ are passage points then the hypothesis
holds.
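
The relation in (i) between the two-sided hitting probabilities and the
function p can be verified exactly in a discrete analogue. The sketch
below is an illustration only: it assumes a nearest-neighbour random walk
on the integers with up-step probability 2/3 (not the continuous process
of the text), computes the probability of reaching the upper endpoint first
by solving the one-step recursion, and compares the direct computation on
the subinterval with the formula. The function name is hypothetical.

```python
from fractions import Fraction

def reach_upper_first(lo, hi, up):
    """Return {x: P_x[walk reaches hi before lo]} for steps +1 w.p. `up`, -1 w.p. 1 - up."""
    down = 1 - up
    h = {lo: Fraction(0), lo + 1: Fraction(1)}      # unnormalised solution, h(lo) = 0
    for x in range(lo + 1, hi):
        h[x + 1] = (h[x] - down * h[x - 1]) / up    # from h(x) = up*h(x+1) + down*h(x-1)
    return {x: h[x] / h[hi] for x in range(lo, hi + 1)}

a, b, up = 0, 10, Fraction(2, 3)
p = reach_upper_first(a, b, up)                     # p(x) = p(x, b, a)
x1, x, x2 = 2, 5, 8
direct = 1 - reach_upper_first(x1, x2, up)[x]       # p(x, x1, x2): reach x1 before x2
from_p = (p[x2] - p[x]) / (p[x2] - p[x1])
print(direct == from_p)  # True
```

The computed p is also increasing from 0 at a to 1 at b, in line with the
first part of (i).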

Proof. We use a without further comment.

1° Let $a < x < y < b$, so that

$$p(x) = p(x, b, a) = p(x, y, a) p(y, b, a) = p(x, y, a) p(y),$$

hence $p(x) \le p(y)$. If $p(x) = p(y)$ then either $p(x) = 0$ or
$p(x, a, y) = 1 - p(x, y, a) = 0$. In the last case, $p(x, a) = 0$ and
$p(b, a) = p(b, x) p(x, a) = 0$ contrary to the hypothesis. In the first case,
$p(x, b) = 0$ and $p(a, b) = p(a, x) p(x, b) = 0$ contrary to the hypothesis.
Thus $p(x) < p(y)$, and the first assertion in (i) is proved.

The limit assertions in (i) result from the corresponding ones in b,
upon using for the first one the relation $p(x, a, b) = 1 - p(x)$. Similarly,
as $x \to y - 0$, $p(x, y, a) \to 1$, hence

$$p(x) = p(x, y, a) p(y) \to p(y),$$

the function $p(x)$ is left continuous and, analogously, the function
$1 - p(x) = p(x, a, b)$ is rightcontinuous. This proves the continuity
assertion in (i). Since

$$p(x, x_1, x_2) = p(x, a, x_2)/p(x_1, a, x_2) = (1 - p(x, x_2, a))/(1 - p(x_1, x_2, a))$$

and

$$p(x, x_2, a) = p(x)/p(x_2), \quad p(x_1, x_2, a) = p(x_1)/p(x_2),$$

the last assertion in (i) follows.

2° If $a \le x_1 < x < x_2 \le b$ then

$$m(x) - p(x, x_1, x_2) m(x_1) - p(x, x_2, x_1) m(x_2) = E_x \tau_{(x_1, x_2)} > 0$$

so that, by (i),

$$m(x) > \frac{p(x_2) - p(x)}{p(x_2) - p(x_1)} m(x_1) + \frac{p(x) - p(x_1)}{p(x_2) - p(x_1)} m(x_2)$$

and the function $-m(x)$ is convex with respect to the function $p(x)$.
The remaining assertions of (ii) result from this convexity, upon
transforming it into ordinary convexity as follows. By (i), the function $p(x)$
has an inverse function $q(y)$. Set $n(y) = -m(q(y))$ and note that if
$0 \le y_1 < y < y_2 \le 1$ then $a \le x_1 = q(y_1) < x = q(y) < x_2 = q(y_2) \le b$. Thus

$$n(y) < \frac{y_2 - y}{y_2 - y_1} n(y_1) + \frac{y - y_1}{y_2 - y_1} n(y_2)$$

and the function $n(y)$ bounded from above (by 0) is convex. Since

$$s^-(x) = -D_p^- m(x) = D_y^- n(y), \quad s^+(x) = -D_p^+ m(x) = D_y^+ n(y),$$

the assertions follow from the corresponding properties of the function
$n(y)$. The proof is terminated.


A. One-dimensional diffusion operator theorem. Let $X_T$ be
continuous stable and let all $x \in [a, b)$ be passage points. Let $g \in \tilde{C}_d$ and

$$m(x) = E_x \tau_{(a,b)}, \quad s^-(x) = -D_p^- m(x), \quad s^+(x) = -D_p^+ m(x).$$

On $(a, b)$, the function $D_p^- g(x)$ exists and is left continuous, the function
$D_p^+ g(x)$ exists and is right continuous, they coincide on $C(s)$, and

$$\mathfrak{D}g(x) = D_s^+ D_p^- g(x) = D_s^+ D_p^+ g(x).$$


Proof. We apply c without further comment. Set

$$g(x, y) = \frac{g(y) - g(x)}{p(y) - p(x)}, \quad m(x, y) = \frac{m(y) - m(x)}{p(y) - p(x)}.$$

Relation ($\mathfrak{D}_1$) becomes

(1)    $$\mathfrak{D}g(x) = -\lim_{x_1 \to x - 0,\ x_2 \to x + 0} \frac{g(x, x_2) - g(x, x_1)}{m(x, x_2) - m(x, x_1)}.$$

Since

$$\lim_{x_1 \to x - 0,\ x_2 \to x + 0} (m(x, x_2) - m(x, x_1)) = \lim_{x_2 \to x + 0} m(x, x_2) - \lim_{x_1 \to x - 0} m(x, x_1) = s^-(x) - s^+(x)$$

it follows that

$$\lim_{x_2 \to x + 0} g(x, x_2) - \lim_{x_1 \to x - 0} g(x, x_1) = D_p^+ g(x) - D_p^- g(x)$$

exists,

(2)    $$(s^+(x) - s^-(x)) \mathfrak{D}g(x) = D_p^+ g(x) - D_p^- g(x)$$

and the last difference reduces to zero on $C(s)$. Set

$$G(y) = g(y) - g(x) - D_p^- g(x)(p(y) - p(x)) = (p(y) - p(x))(g(x, y) - D_p^- g(x)),$$

$$M(y) = -m(y) + m(x) - s^-(x)(p(y) - p(x)) = (p(y) - p(x))(-m(x, y) - s^-(x)).$$

Note that $M(x) = 0$, for $y > x$

$$D_p^- M(y) = s^-(y) - s^-(x) > 0, \quad D_p^+ M(y) = s^+(y) - s^-(x) > 0,$$

and, upon letting $x_1 \to x - 0$ in (1),

$$\mathfrak{D}g(x) = \lim_{y \to x + 0} \frac{G(y)}{M(y)}.$$
To obtain the asserted form of $\mathfrak{D}g(x)$, it will suffice to prove that
L'Hospital's rule applies, as follows. First, we show that the limits of
$h^+(y)$ and $h^-(y)$ as $y \to x + 0$ exist and are the same, upon setting

$$h^+(y) = D_p^+ G(y)/D_p^+ M(y), \quad h^-(y) = D_p^- G(y)/D_p^- M(y).$$

Suppose that $\liminf_{y \to x + 0} h^+(y) < \limsup_{y \to x + 0} h^+(y)$ so that there exist distinct
$c'$, $c''$ which lie between these limits. If $c$ is either $c'$ or $c''$ and
$f(y) = G(y) - cM(y)$, then $D_p^+ f = D_p^+ G - c D_p^+ M$ changes signs in $(x, y)$
for any $y > x$. Thus, there exist sequences $y_n, y'_n \to x + 0$ such that
$f$ attains relative maxima at the $y_n$ and relative minima at the $y'_n$.
Therefore, $\mathfrak{D}f(y_n) \le 0$, $\mathfrak{D}f(y'_n) \ge 0$ and, from $\mathfrak{D}f = \mathfrak{D}g - c$ it follows
that $\mathfrak{D}g(y_n) \le c$, $\mathfrak{D}g(y'_n) \ge c$, hence $\mathfrak{D}g(x) = c$. Since $c$ is
either of two distinct numbers $c'$, $c''$, this is impossible. Thus,

$$\lim_{y \to x + 0} h^+(y) = \lim_{y \to x + 0} \frac{D_p^+ g(y) - D_p^- g(x)}{s^+(y) - s^-(x)}$$

exists, and similarly

$$\lim_{y \to x + 0} h^-(y) = \lim_{y \to x + 0} \frac{D_p^- g(y) - D_p^- g(x)}{s^-(y) - s^-(x)}$$

exists. Since $D_p^+ g(y) = D_p^- g(y)$ on the everywhere dense set $C(s)$,
the two limits coincide. Set now

$$F(z) = G(z) M(y) - G(y) M(z), \quad z \in [x, y]$$

and note that $F(x) = F(y) = 0$ so that $F$ attains either its maximum or
its minimum at some $z' \in (x, y)$. In the first case, $D_p^- F(z') \ge 0 \ge
D_p^+ F(z')$, hence

$$h^+(z') \le \frac{G(y)}{M(y)} \le h^-(z')$$

and in the second case, these inequalities are reversed. As $y \to x + 0$,
hence $z' \to x + 0$, the extreme terms converge to the same limit while
the middle term converges to $\mathfrak{D}g(x)$. Therefore,

$$\mathfrak{D}g(x) = \lim_{y \to x + 0} \frac{D_p^+ g(y) - D_p^- g(x)}{s^+(y) - s^-(x)} = \lim_{y \to x + 0} \frac{D_p^- g(y) - D_p^- g(x)}{s^-(y) - s^-(x)} = D_{s^-}^+ D_p^- g(x).$$

The function $s^+$ being right continuous, it follows that

$$D_p^+ g(x + 0) - D_p^- g(x) = (s^+(x) - s^-(x)) \mathfrak{D}g(x)$$

and, taking into account (2), $D_p^+ g(x + 0) = D_p^+ g(x)$, that is, $D_p^+ g$ is
right continuous. Similarly for the remaining assertions. The proof is
terminated.


Note that the passage points form an open set $\sum_j (a_j, b_j)$ and what
precedes applies to any $[a, b] \subset (a_j, b_j)$. Furthermore, the foregoing
form of $\mathfrak{D}g$ can be rewritten as follows:

III. If the $x \in [a, b]$ are passage points, then

$$\mathfrak{D}g(x) = D_s D_p g(x), \quad x \in C(s),$$

$$\mathfrak{D}g(x) = \frac{D_p g(x + 0) - D_p g(x - 0)}{s(x + 0) - s(x - 0)}, \quad x \notin C(s).$$

Together, I, II, and III determine the one-dimensional diffusion
operator.
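
As a sanity check of the form $D_s D_p g$ in III, one can take the simplest
example. The sketch below is an illustration under assumptions not made
in the text: the process is standard Brownian motion on (a, b), for which
p(y) = (y - a)/(b - a) and m(y) = (y - a)(b - y), so that s = -D_p m is
smooth and $D_s D_p g$ should reduce to g''/2. Derivatives with respect to a
function are approximated by central differences; all names are
hypothetical.

```python
import math

def derivative_wrt(f, scale, x, h=1e-3):
    """Central-difference approximation of the derivative of f with respect to `scale` at x."""
    return (f(x + h) - f(x - h)) / (scale(x + h) - scale(x - h))

def ds_dp(g, p, m, x, h=1e-3):
    """Approximate D_s D_p g(x), where s = -D_p m."""
    s = lambda y: -derivative_wrt(m, p, y, h)
    dp_g = lambda y: derivative_wrt(g, p, y, h)
    return derivative_wrt(dp_g, s, x, h)

a, b = 0.0, 2.0
p = lambda y: (y - a) / (b - a)        # P_y[reach b before a], Brownian motion (assumed)
m = lambda y: (y - a) * (b - y)        # E_y[exit time from (a, b)], Brownian motion (assumed)
value = ds_dp(math.sin, p, m, 0.9)
print(abs(value - (-0.5 * math.sin(0.9))) < 1e-5)
```

Here every point is a continuity point of s, so only the first case of III
is exercised.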


COMPLEMENTS AND DETAILS

Unless otherwise stated, the state space $(\mathcal{X}, \mathcal{S})$ is a separable locally compact
metric space, $\mathcal{S}$ is its $\sigma$-field of topological Borel sets, and $P_t(x, S)$, $x \in \mathcal{X}$, $S \in \mathcal{S}$,
$t > 0$, is a stationary tr.pr. with $P_t(x, \mathcal{X}) \le 1$.

1. $P_t(x, \mathcal{X})$ is nonincreasing as $t > 0$ increases.

2. Let $\tau \ge 0$ be a r.v. not necessarily a.s. finite. Let $X_t$ be a r.v. defined on
$[t < \tau]$. $X_T = (X_t, t \ge 0)$ is a r.f. of "lifetime $\tau$."

If $P_t(x, \mathcal{X}) \equiv 1$, then there exists a Markov r.f. of infinite lifetime with the
given tr.pr. and an arbitrary initial distribution on $\mathcal{S}$. If $P_t(x, \mathcal{X}) \le 1$ and
$P_h(x, \mathcal{X}) \to 1$ as $h \to 0$ for every $x$, then there exists a Markov r.f. $X_T$ of
possibly finite lifetime with positive pr., with the given tr.pr. and arbitrary
initial distribution on $\mathcal{S}$:

Add an isolated point at infinity "$\infty$" and determine a tr.pr. $P'_t(x', S')$, where
the sets $S'$ are sets $S$ and sets $S + \{\infty\}$, so that it coincides with $P_t(x, S)$ for
$x' = x$ and $S' = S$ and $P'_t(x', \{\infty\}) = 1$ or $1 - P_t(x, \mathcal{X})$ according as $x' = \infty$ or
$x' \ne \infty$. Construct $X'_T$ as above. Complete $\mathcal{B}(X_s, s \le t)$. Define $\tau(\omega)$ as the
supremum of all rational $r$ for which $X_r(\omega) \ne \infty$; "curtail" $X'_T$ to have lifetime
$\tau$ upon replacing the domain of $X_t$ by $[t < \tau]$ and then replace $X_t(\omega) = \infty$ by
an arbitrary $x_0 \in \mathcal{X}$.

3. For all $(x, S)$, if $P_h(x, S) \to I(x, S)$ as $h \to 0$, then $\{1 - P_h(x, \mathcal{X})\}/h$
converges to a finite limit: Introduce the point at infinity and $P'_t(x', S')$ and
apply section 39.1.

4. Apply section 39 to Markov processes with a finite number of states under
the continuity condition and to Markov processes with a countable number of
states under the uniform continuity condition.

5. Let $P_t(x, S)$, $t > 0$, be $\mathcal{S} \times \mathcal{T}$-measurable ($\mathcal{T}$ is the $\sigma$-field of Lebesgue sets
in $(0, \infty)$).

a) $P_t(x, S)$ is continuous in $t$ if and only if there exists a finite measure $\mu$ on
$\mathcal{S}$ such that $P_t(x, S)$ is $\mu$-continuous in $S$:

If $P_t(x, S)$ is continuous in $t$, take $\mu(S) = \int_0^\infty e^{-t} P_t(x, S)\,dt$. If $\mu$ exists, given $t$
and $S$ note that for $0 < \epsilon < t < t'$ and $|h_n| < \epsilon$

I PM-bhn(xyS) - P 9 (x y S) I ds^i{dx) = Jo/v) I P 9+hfi {x y S) - P 9 (x y S) I ds y 

where the inner integral converges to zero as A n and, for some subsequence 
h f ny P, + h f n (x y S) Pt(x y S) for some s C (^, /) a.e* in hence a.e. in P t ^,(x y •)• 
Thus,


Pt^n(x y S) =jPt^ 9 (x y dy)P 3 ^ n {y y S) P t (x } S). 


What if the state space is the Borel line and Pt(x y S) = \ or 0 according as 
x + t belongs or not to S ? 

b) Let $P_t(x, S)$ be right continuous in $t$. Then it is continuous in $t$: Note
that to find $\mu$ finite such that $P_t(x, S)$ is $\mu$-continuous in $S$, it suffices to know
that $P_t(x, S)$ is $t$-measurable and that if $P_t(x, S) = 0$ for a.e. $t$, then $P_t(x, S)
= 0$.

If $P_t(x, S) = \int_S p_t(x, y)\,\mu(dy)$ where $\mu$ is $\sigma$-finite and $p_t(x, y)$ is $(x, y)$-measurable,
then $P_t(x, S)$ is continuous in $t$.

c) If $\lim_{h \downarrow 0} P_h(x, S)$ exists for all $(x, S)$, then $P_{t+}(x, S) = \lim_{h \downarrow 0} P_{t+h}(x, S)$ exists,
is a tr.pr. continuous in $t$, and coincides with $P_t(x, S)$ except for countably many
values of $t$: Note that $P_{t+}(x, S)$ is right continuous in $t$ and that $P_t(x, S)$ has at
most countably many discontinuity points in $t$.

6. Let $X_t$ be a Poisson r.f. with the sample functions selected to be left
continuous, and let $\tau(\omega)$ be the infimum of those $t$ for which $X_t(\omega) = 1$. Then
the Markov r.f. $X_t$ is stable and $\tau$ is a time of $X_t$ but is not its Markov time.

7. Let $\alpha_n, \beta_n > 0$, $p_n = \alpha_n/(\alpha_n + \beta_n)$, $q_n = \beta_n/(\alpha_n + \beta_n)$. Let $X_n = (X_n(t),
t \ge 0)$, $n = 1, 2, \cdots$, be independent Markov r.f.'s with two states 0 and 1,
$X_n(0) = 0$, and

$$P(X_n(t + h) = 1 \mid X_n(t) = 0) = \alpha_n h + o(h),$$

$$P(X_n(t + h) = 0 \mid X_n(t) = 1) = \beta_n h + o(h).$$

a) $P(X_n(t) = 0 \mid X_n(0) = 0) \ge q_n$, $P(X_n(s) = 0, t \le s \le t + h \mid X_n(t) = 0)
= e^{-\alpha_n h}$. If $\sum \alpha_n/\beta_n < \infty$, that is, $\sum p_n < \infty$, then, at every time $t$, a.s. $X_n(t) = 0$
for almost all $n$.

b) If $\sum \alpha_n = \infty$, then $P(X_n(s) = 0, t \le s \le t + h, \text{ for all } n \ge m \mid X_n(t) = 0)
= 0$ for every $m$.

Let $X = (X(t), t \ge 0)$ be the joint r.f. $(X_1(t), X_2(t), \cdots, t \ge 0)$. $X$ is a Markov
r.f. If $\sum p_n < \infty$ then $X$ has only a countable number of states. If, moreover,
$\sum \alpha_n = \infty$ then all these states are instantaneous.

c) Analytically, let the state space consist of sequences $x = (x_1, x_2, \cdots)$,
$y = (y_1, y_2, \cdots)$, $\cdots$, of 0's and 1's with finitely many 1's. Set

$$p_t^{(n)}(0, 0) = q_n + p_n e^{-(\alpha_n + \beta_n)t}, \quad p_t^{(n)}(1, 1) = p_n + q_n e^{-(\alpha_n + \beta_n)t},$$

$$p_t^{(n)}(0, 1) = 1 - p_t^{(n)}(0, 0), \quad p_t^{(n)}(1, 0) = 1 - p_t^{(n)}(1, 1).$$

The function $P_t(x, y) = \prod_n p_t^{(n)}(x_n, y_n)$ is a tr.pr.
which obeys the continuity condition, and $P'_0(x, x) = -\infty$.
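The two-state kernels of exercise 7c can be checked numerically. The following is a minimal sketch and not part of the original text: the function names and the sample rates are illustrative assumptions. It builds $p_t^{(n)}$ from $\alpha_n, \beta_n$, measures how far $P_{s+t}$ is from $P_s P_t$, and evaluates the difference quotient $\{1 - P_h(x, x)\}/h$ at the all-zero state for a finite product of factors, which for small $h$ should be close to the sum of the $\alpha_n$ used.

```python
import math


def two_state_kernel(alpha, beta, t):
    """Transition matrix at time t of a two-state (0/1) Markov r.f. with
    jump rates alpha (0 -> 1) and beta (1 -> 0), as in exercise 7c."""
    p = alpha / (alpha + beta)
    q = beta / (alpha + beta)
    decay = math.exp(-(alpha + beta) * t)
    p00 = q + p * decay
    p11 = p + q * decay
    return [[p00, 1.0 - p00], [1.0 - p11, p11]]


def matmul2(a, b):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def chapman_kolmogorov_gap(alpha, beta, s, t):
    """Largest entrywise difference between P_{s+t} and P_s P_t."""
    lhs = two_state_kernel(alpha, beta, s + t)
    rhs = matmul2(two_state_kernel(alpha, beta, s),
                  two_state_kernel(alpha, beta, t))
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))


def escape_rate_estimate(alphas, betas, h):
    """Difference quotient (1 - P_h(x, x)) / h at the all-zero state x,
    using only finitely many factors (an assumption of this sketch)."""
    stay = 1.0
    for alpha, beta in zip(alphas, betas):
        stay *= two_state_kernel(alpha, beta, h)[0][0]
    return (1.0 - stay) / h


if __name__ == "__main__":
    print(chapman_kolmogorov_gap(1.5, 0.5, 0.3, 0.7))
    print(escape_rate_estimate([1.0, 2.0, 3.0], [1.0, 1.0, 1.0], 1e-6))
```

With more and more factors whose $\alpha_n$ sum to infinity, the difference quotient grows without bound, which is the numerical counterpart of the instantaneous states of part b).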

8. Let $P$, $Q$, with or without affixes, be (pr.) distributions on $\mathcal{S}$ and set

$$d(P, P') = \mathrm{Var}\,(P - P') = \int |P(dx) - P'(dx)|.$$

a) $d$ is a metric and the space of distributions is complete in this metric.

Let $X_T$, $T = [0, \infty)$, be stationary Markovian with tr.pr. $P_t(x, S)$, $P_t(x, \mathfrak{X}) = 1$,
and distributions $P_t$, $P'_t$ (of $X_t$) corresponding to initial distributions
$P_0$, $P'_0$ (of $X_0$).

b) $d(P_t, P'_t)$ does not increase as $t$ increases, whatever be $P_0$ and $P'_0$.

c) If $d(P_t, P) \to 0$ as $t \to \infty$ for some $P_0$, then $P$ is invariant (that is, $P_t = P$
for all $t$).

Suppose that (H): for every $\epsilon > 0$ there exist $C \in \mathcal{S}$, a distribution $Q$, positive
numbers $a$, $b$, $t_1$, and there exists $t_0$ for every $P_0$ such that

(i) $aQ(S) \le P_{t_1}(x, S)$ for all $x \in S \subset C$, (ii) $P_t C \ge 1 - \epsilon$ for all $t \ge t_0$,
(iii) $P_t S \le bQ(S) + \epsilon$ for all $S \subset C$ and $t \ge t_0$.

Then (C): there exists a unique invariant distribution $P$ which is “ergodic”
(that is, such that $d(P_t, P) \to 0$ whatever be $P_0$):

d) Under (H), $d(P_t, P'_t) \to 0$ as $t \to \infty$ whatever be $P_0$, $P'_0$. Thus, there
exists at most one invariant distribution and when it exists, it is ergodic.

e) Under (H), $d(P_{t_m}, P_{t_n}) \to 0$ as $t_m, t_n \to \infty$ whatever be $P_0$ and $t_n \to \infty$.

f) Are conditions (H) necessary for (C) to hold?
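Parts b) and d) can be observed on a small example. The sketch below is an illustration only: it replaces the continuous-time tr.pr. by a single stochastic matrix applied repeatedly (a discrete-time stand-in), and the matrix, the initial distributions, and the function names are all hypothetical choices. It records $d(P_t, P'_t) = \sum_x |P_t(x) - P'_t(x)|$ along the way.

```python
def step(dist, kernel):
    """One transition: the row vector dist multiplied by the stochastic matrix."""
    n = len(kernel)
    return [sum(dist[i] * kernel[i][j] for i in range(n)) for j in range(n)]


def variation_distance(p, q):
    """d(P, P') = sum over states of |P(x) - P'(x)| on a finite state space."""
    return sum(abs(a - b) for a, b in zip(p, q))


def distance_path(p0, q0, kernel, steps):
    """Distances d(P_t, P'_t) for t = 0, 1, ..., steps."""
    path = [variation_distance(p0, q0)]
    p, q = p0, q0
    for _ in range(steps):
        p, q = step(p, kernel), step(q, kernel)
        path.append(variation_distance(p, q))
    return path


# Hypothetical three-state kernel; every row sums to one.
KERNEL = [[0.9, 0.1, 0.0],
          [0.2, 0.7, 0.1],
          [0.1, 0.3, 0.6]]

if __name__ == "__main__":
    for t, dist in enumerate(distance_path([1.0, 0.0, 0.0], [0.0, 0.0, 1.0], KERNEL, 10)):
        print(t, dist)
```

The printed distances start at 2 (mutually singular initial distributions) and never increase; for this particular kernel they also tend to zero, which is the behavior asserted in d) under (H).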

9. Quasileftcontinuity (Blumenthal, Hunt, Meyer). Let $T = [0, \infty)$, $X_t$ be
$\mathcal{B}_t$-measurable, $\mathcal{B}_t \uparrow$, $\tau_n$, $\tau$ be times of $\mathcal{B}_T$ and $g$ be nonnegative measurable on $\mathfrak{X}$.

a) $(X_t, \mathcal{B}_t)$ is Markovian with tr. pr. $P_t(x, S)$ $\iff$ $(P_{t-s}(X_s, S), \mathcal{B}_s, 0 \le s < t)$
are martingales. What about adding “strongly”?

Let $(X_t, \mathcal{B}_t)$ be Markovian with semi-group $(T_t, t \ge 0)$. Then $(g(X_t), t \ge 0)$
is a supermartingale $\iff$ $g$ is supermedian, i.e., $g \ge T_t g$.

We say that $g$ is excessive (uniformly) if it is supermedian and $T_h g \to g$ ($T_h g
\to g$ uniformly and $g$ is bounded); $g$ (or $X_t$) is quasileftcontinuous (qlc) if $\tau_n \uparrow \tau$
a.s. $\Rightarrow$ $g(X_{\tau_n}) \to g(X_\tau)$ (or $X_{\tau_n} \to X_\tau$) a.e. on $[\tau < \infty]$.

b) Let $(X_t, \mathcal{B}_t)$ be strongly Markovian with almost all sample functions
rightcontinuous with left limits; we can take $\mathcal{B}_t = \mathcal{B}_{t+}$.

Lemma. Let $g$ be bounded with $g(X_{\tau_n}) \to Y$ a.e. on $[\tau < \infty]$ as $\tau_n \uparrow \tau$ a.s. If
$T_h g \to g$ uniformly, then $Y = E(g(X_\tau) \mid \mathcal{B}_{\tau-})$ where $\mathcal{B}_{\tau-}$ is the $\sigma$-field over the $\mathcal{B}_{\tau_n}$:

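Reduce to bounded $\tau$ (replacing $\tau_n$ by $\inf(\tau_n, t)$ with arbitrary $t \in T$). Let
$\sigma_n = \sup(\tau_n + h, \tau)$ so that, as $n \to \infty$, $P[\sigma_n \neq \tau_n + h] \to 0$ and $\Delta_2 =
g(X_{\tau_n + h}) - g(X_{\sigma_n}) \neq 0$ with pr. $\to 0$. By the strong Markov property, on sets
of $\mathcal{B}_{\tau_n}$ the integrals of $\Delta_1 = g(X_{\tau_n}) - g(X_{\tau_n + h})$ are those of $g(X_{\tau_n}) - T_h g(X_{\tau_n})$,
and those of $\Delta_3 = g(X_{\sigma_n}) - g(X_\tau)$ are those of $T_{\sigma_n - \tau} g(X_\tau) - g(X_\tau)$ with $\sigma_n - \tau \le h$;
both converge to zero uniformly in $n$ as $h \to 0$. Thus, given $\epsilon > 0$, $B \in \mathcal{B}_{\tau_m}$, then fixing $h$ sufficiently
small, for all $n$ sufficiently large, $\left| \int_B g(X_{\tau_n}) - \int_B g(X_\tau) \right| < \epsilon$. Letting $n \to \infty$
then $\epsilon \to 0$, $\int_B Y = \int_B g(X_\tau)$; and this equality extends on $\mathcal{B}_{\tau-}$.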

Uniformly excessive $g$ on qlc $X_t$ are qlc: For, lemma applies to bounded
supermartingale $g(X_{\tau_n})$ and $X_\tau$ is $\mathcal{B}_{\tau-}$-measurable by qlc of $X_t$.

If the semi-group $(T_t, t \ge 0)$ is strongly continuous on invariant $C_0$, then $X_t$ is
qlc:

Reduce to compact $\mathfrak{X}$ (proceeding as in 40.3d). For $X_{\tau_n} \to Y$ exists by
existence of left limits and lemma applies. If $P[Y \neq X_\tau] > 0$, there is a sphere $V_x(r)$
with $PA = P[Y \in V_x(r), X_\tau \notin V_x(2r)] > 0$. With $g \in C$, $0 \le g \le 1$, $g = 1$ on
$V_x(r)$, $g = 0$ on $V_x^c(2r)$, integrating $g(Y) = E(g(X_\tau) \mid \mathcal{B}_{\tau-})$ on $[Y \in V_x(r)]$
yields $PA \le 0$, a contradiction.

c) Qlc strongly Markovian processes with almost all sample functions
rightcontinuous with left limits are standard (Markovian) in applications to potential
theory. Combine with 40.3 and 41.1 to find conditions for existence of
equivalent standard Markov processes.
237, 385, 387
measure, 343 
representation, 343 
Kolmogorov, 30, 94, 302, 407, 408,
410, 263, 237, 385, 387, 388
approach, 145 
equation, 295 
equation, strong, 309 
inequalities, 25, 247, 275, 44 
strong law of large numbers, 251
three series criterion, 249
zero-one law, 241, 66

Kronecker lemma, 250 
Krylov, 110 

Lambert, 46 

Langevin, 236 

Laplace, 22^ 281 ， 286 、 287 、 40Z 
Law of large numbers 

Bernoulli, 14, 26, 244 、 282, 77 
Borel strong, 18 y 26 y 244^ 11 

classical, 290 
Kolmogorov strong ，251 
Law(s), 174 
degenerate, 215、 281 
equivalence lemma, 290 
equivalent sequences, 39 
infinitely decomposable, 308 
normal, 213, 281 
of the iterated logarithm, 219 、 
249, 276 

Poisson, 282 
probability, 174、 214 
self - decomposable ，334 
stable, 326 y 363 
types of, 215 
universal, 403 
Law(s) (Cont.)
weighted, 44 

zero-one, 241 、 374 y 64, 233 
Lebesgue, 408 
approach, 143 
decomposition theorem, 131 
field, 129 
integral, 29 
measure, 128 
sets, 129 

Lebesgue-Stieltjes
field, 128 
integral, 128 
measure, 128 
Le Cam, 93, 409 、 237, 263 
extension of Donsker and Kol¬ 
mogorov theorem，286 

Note by, 286—287 
Leeuvenhoek, 239 
Lévy, P., 199, 204, 301, 408, 410,
54, 63, 64, 65, 161, 188, 193,
210, 237, 384, 386
centering function, 350 
continuity theorem, 204 
demon, 188 
inequalities, 259 
function, 364 
measure, 343 
representation, 343 
Liapounov, 411 
inequality, 172 
theorem, 213, 287、 289 
Limit(s) 

along a direction, 68 

inferior, 58 

of a directed set, 68 

one - sided, of random functions, 

174,175 

sample lemma, 361 
superior, 58 
Limit of a sequence of 
functions, 113 
laws, 214 
numbers, 104 
sets, 58 


Limit problem 
central, 302 
classical, 286 
Lindeberg, 292, 411 
Line 

Borel, 93 y 107 
extended real, 104 
real, 93 y 103 
Linear 
closure, 79 
functional, 80 
mapping, 102 
mappings lemma, 79 、 102 
space, 70 

transformation, 107 
Linearly ordered, 67 
Locally compact, 71 
Lomnicki, 409 
Lower 
class, 272 
variation, 87 



completeness theorem ，163 
convergence theorem ，164 
ergodic criteria, 105 
space(s )，162 
Lukacz, 408 
Lusin’s theorem, 140 

Mappings 

bounded, 102 
linear, 102 
nonnegative, 102 
norm of, 102 

Marcinkiewicz, 225, 2S4 y 302 y 409 y 
167 

Marczewski, 385 
Markov(ian), 407 y 288, 290 
chain, 28 y 18, 31 
dependence, 18, 28 
endomorphisms ， 299, 334 
endomorphisms criterion, 335 
equivalence theorem, 289 
inequality, 160 
Markov(ian) (Cont.)

infinitesimal operators criterion, 
348 

process, 164, 288, 294 
process, regular, 164, 294,297 
process, stationary, 293, 302 
property, 281 

property, strong, 304, 308, 357 
sample rightcontinuity theorem, 
347 

semi-groups, 331， 335 
semi-groups criterion, 336 
semi-groups unicity theorem, 

341 

stationarity theorem，299 
time，301 

time ， stationary, 307 
time theorem, 359 
(sub)Martingale(s) ， 54,55,194, 

195 

central lemma, 200 
closure (sed) ， 54, 200 
closure theorem, 60 
convergence theorem, 59 
criterion for Brownian motion, 

244 

crossings, 57, 202 
decomposition, 55 
elementary properties, 195 
elementary closure properties, 

197 

extended central lemma, 208 
extension, 206 
inequalities, 57, 201
preserving transformations, 196 
reversed sequences, 62, 63 
right closure lemma，197 
right regularization, 206 
sample limits lemma, 202 
sample limits theorem, 203 
second order, 129 
sequences, 62 
stopping, 208 

strong laws of large numbers, 66 
supermartingale, 195 


(sub)Martingale(s) {Cont.) 
suprema theorem, 199 
zero - one laws, 64 
Matrices, method of ，48 
Matrix, transition probability, 
Maxima times, 376 
Maximum times, 393 
Maxwell - Boltzmann statistics, 42, 
43 

Maxwell demon, 188 
Mean 

quadratic, 248 y 122, 135 
r-th, 159 
Measurability 
lemma, 169 
theorem, 108 y 178 
Measurable 
function, 107 
random function, 168 
sample -， 167 
sets, 60 y 64 y 107 
space, 60 y 64 y 107 
transformation, 46 
Measure(s), 84^ 112 
convergence in, 116 
extension of, 88 
extension of — lemma, 194 
Lebesgue, 129 
Lebesgue—Stieltjes, 128 

normed, 90 y 151 

outer, 88 

outer extension, 89 
product, 136 
signed, 87 
space, 112 
Median, 256 
centering at, 256 
conditional, 51 
Metric 

compactness theorem, 76 
linear space, 79 
space, 73 
topology, 73 

Meyer, 193, 383, 386,388 
Miller ， 76, 385 
Minimal

class over, 60 
sufficient σ-field, 13
Minimality criterion, 13 
Minkowski inequality, 158 
Mixed conditional distribution, 25 
Moment(s) 

convergence problem, 187 
convergence theorem, 186 
k-th, 157、 186 
lemma, 254 
r-th absolute, 157 、 186 
Monotone 
class, 60 

convergence theorem, 125 
sequence of sets, 58 
Montmort, 46 
Moving discontinuities, 147 
μ⁰-measurable, 88
Multiplication 
lemma, 238 
property, 11 
rule, 24 
theorem, 238 

Mutual convergence, 74 、 103, 113 、 
153 

Nagaev, 384 

Negligibility, uniform asymptotic, 
302、 314 
conditional, 47 
Neighborhood, 66 
Nelson, 236, 386 
Neumann, von, theorem, 76, 385 
Neveu ， 289, 386 
Neyman, 407 y 11 
Nikodym, 133 y 408 
Nondecreasing sequence, 58 
Nonhereditary systems, 28 
Nonincreasing sequence, 58 
Nonrecurrent state, 31 
No return state, 31 
Norm 
Hilbert, 80 


Norm {Cont.) 
of a functional, 79 
of a mapping, 79, 102 
Normal 

approximation lemma, 241 
approximation theorem, 320 
continuous decomposable crite¬ 
ria, 223 

continuous decomposable proc¬ 
ess, 222 

convergence criterion, 307 
convergence criterion, classical ， 
292 

decomposition theorem, 
law, 213, 281 
stability theorem，151 
strongly, 128, 133 
type, 283 
weighted, 45 
Normalized 
covariance, 140 
distribution function, 199 
random function, second order ， 

140 

Normed 

functional, 80 
linear space, 79 
mapping(s), 102 
measure, 91 、 151 
sums, 331 
Norms ergodic 
lemma, 108 
theorem, 109 
Nowhere dense, 75 
Null 

preserving translation, 103, 108 
set, 91 y 112 
state, 32 

Number of jumps lemma, 226 
Numerical function, 105 

Open 

covering, 69 
set, 66 
Operator(s)
diffusion, 372 

extended infinitesimal, 366, 368 
中 - weak infinitesimal, 338 
one-dimensional diffusion, 374 
stationary transition, 303 
strong infinitesimal, 337 
transition, 297 
Operators criterion 
infinitesimal, 345 
Markov infinitesimal, 341 
Operators theorem 
diffusion, 375 

one-dimensional diffusion, 378 
Ordering, partial, 67 
Ornstein, 381 、 410 、 307 
and Uhlenbeck, 236 
Orthogonal 

expansion theorem, 143 
increments, 145 
random variables, 246 、 122 
Orthogonal decomposition 
elementary, 125 
harmonic，149 
proper, 144 
Outcome(s), 14 
field of, 4 

of an experiment, 4 
Outer 

extension, 89 
measure, 88 
Owen, 411 
Oxtoby，385 

Paley, 237, 387 
Parseval relation, 386 
Parzen, 407 
Perrin, 235 
Petrov, 410 
Petrovsky, 263 
Physical statistics, 42 
Planck, 44 
Poincare, 235 

recurrence theorem, 28 
Point differentiation lemma, 313 


Poisson 

compound, 347 
continuous decomposable, 273 
continuous decomposability cri¬ 
terion, 223 

convergence criterion, 229^ 329 
decomposable process, 231 
decomposition theorem, 283 
distribution of particles, 234 
law, 282 

sample jumps lemma, 224 
theorem, 15 
type, 283 

Pollaczec, 394, 396, 400, 411 
-Spitzer identity, 393, 400
Pollard, 34 
Polya, 368 y 409 
Port, 369, 393, 412 
Positive 
part, 105 
state, 32 
Possible 
state(s), 370 
value(s), 370 
values theorem, 371 
Probability, 5, 8 y I6 y 91 y 151,152 
conditional, 6 y 24 、 3—7 
convergence in pr” 153 
convergence on metric spaces, 
189 、 190 
distribution, 168 
field, 8 

invariant, 39 y 99 
law, 214 

product — theorem, 92 
rule, total, 24 
stability in, 244 
stationary transition, 301 
sub ,187 

transition, 29^ 32, 90, 295 
Probability space, 91 ^ 151 、 152 
induced, 168 
product, 92 
sample, 168, 29
Product
cylinder, 62 
field, 61 y 62 

measurable space, 61 、 62、 137 
measure, 135 
measure theorem, 136 
probability, 92 
probability space, 92 
probability theorem, 92 
scalar, 80 y 122 
set, 61 

σ-field, 61, 62
space, 61 y 62 

Prohorov, 190 、 193 、 264 、 409, 264, 
387 

Projection, 108 
theorem, 127 
Proper 

orthogonal decomposition, 144 
subspace, 108 
value, 108 

31-definition, 164 
Rademacher inequality, 123 
Radon - Nikodym theorem, 133 
extension, 134 
Raikov, 283, 411 
Random 
analysis, 163 
event, 5, 8 
process, 164 
sequence, 152,155 
stopping, 208 

time(s), 376, 188, 305
time identities, 390 
time translations, 376 
trial, 6 

variable, 6 y 9 y 17^ 152 
vector, 152,155 
walk, 47, 378, 379 
Random function, 152 、 156 、 163, 
164 

Borel, 168 
decomposable, 212 
measurable, 168 


Random function {Cont.) 
regular Markov, 296 
separable, 171 

stationary, 169 
theorem, stationarity, 170 
Random functions laws conver¬ 
gence criterion, 268 
Range, 63 
space, 62^ 105 
Ranked 

random variables, 350 
sums, 405 
Ray, 369, 395, 412 
Real 
line, 93 

line, extended, 93 y 107 
number, 93 
number, extended, 93 
Recurrence 
criterion, 32 
theorem, 380 

Recurrent 
state, 31 y 380 
walk, 380 

Reflection principle, 260 
Regular 

conditional probability, 138 、 19, 

20 

process, 164 
q-pair, 329 
Regular Markov 
existence theorem, 297 
process, 294 
random function, 298 
Regular variation, 354 
criterion, 354 
Regularity theorem, 29 
Relative 

compactness, 190 
compactness theorem, 195 
conditional expectation, 10 

Representation theorem, 313 
integral, 166 

Reproducing kernel, 156 
Resolvent lemma, 341
tweak, 342 

Restricted integration theorem, 25
Return 
criterion, 32 
state, 31 

Riemann integral, 129 
Riemann - Stieltjes integral, 128 
in quadratic mean, 138 
Riesz, F., 222, 76 
lemma, 81 
Rightcontinuous 
process, 370 

stable processes extension 
lemma, 371 

Right regularization, 206 
r - th 

absolute moment, 157 、 186 
in—mean, 159 
Ruin, gambler’s, 48 
Ryll-Nardzewski, 77, 385 

Saks, 408 
Sample, 167 
continuous, 167 
integrable, 167 
jumps lemma, Poisson, 224 
lemma, centered, 218 
lemma, discontinuous part, 220
limits lemma, 361 
measurable, 167 
probability space, 168 y 29 
rightcontinuity theorem, 
Markov, 364 
space, 167 

step functions theorem, 329 
Sample continuity, 179 
Brownian, 243 
moduli theorem, 183 
modulus, Brownian, 247 
Savage, 374 、 411 、 11, 384 
Scalar product, 80 y 122 
Scheffe, 408 

Schwarz inequality, 158 
Second category, 75 


Second order 
calculus, 185 
calculus theorem, 186 
chain, 130 

integral decomposition, 232 
martingale, 129 
random function, 131 
random variable, 121 
. stationarity, 148 
Sections of 
sets, 61, 62, 135
functions, 61 y 62 y 135 
Self-decomposable (bility), 334 
criterion, 335 
Semi - group(s) 

Borelian, 336 
convergence lemma, 344 
criterion, Markov, 336 
exponential, 339 
Markov, 331 
property, 302, 335 
property, generalized, 295 
strongly convergent, 343 
Separability, 170 
almost sure, 173 
criteria, 171 
existence theorem, 173 
lemma, 172 

Separable 

random function, 171 
σ-field, 21
space, 72 

Separating sets, 171, 174 
Separation theorem, continuity, 
176 

Sequences 

convergence equivalent, 245 
(sub)martingales, 62 
random, 152, 155 
reversed (sub)martingales, 62, 

63 

stationary, 83 

tail of, 241 

tail equivalent, 245 



Sequences of laws 

completely equivalent, 39 
weakly equivalent，39 
Series criterion 
three, 249 
two, 263 

Set(s) 

Borel, 93 y 104 
bounded, 74 
closed, 66 
compact ，69 
dense, 72 
directed, 68 
empty, 4^ 54 
Lebesgue, 129 
measurable, 60 y 64 y 107 
null, 91,112 
open, 66 
product, 61 
separating, 171， 174 
subdirected, 69 
totally bounded ，75 
Set function 
additive, 83 
continuous, 85 
countably additive ，83 
finite, 82, 111 
finitely additive, 83 
σ-additive, 83, 111
σ-finite, 83, 111
Sevastianov, 395
Shohat, 187
σ-additive, 81, 111
σ-field(s), 59
chained, 18 
compound, 156、 235 
independent, 236 
induced ，64 
invariant, 79 
minimal sufficient, 13 
product, 61 、 62 
separable，21 
sufficient，11 
tail, 241 
σ-uniform continuity, 318
Signed measure ，87 
Simple 

discontinuity, 176, 318 
function, 64 y 107 
random variable, 6 y 152 
Skorohod,264,272,275,386 
Brownian embedding, 271 
Gikhman and, 386 
Slutsky, 130, 386 
Snell, 384 
Space 

adjoint, 81 

Banach, 79 
Borel, 93,107 

C, 343 
Co, 347 
C_u, 347
compact ，69 
complete metric, 74 

D, 275 

Hausdorff, 68
Hilbert, 80 

induced probability, 168 
linear, 79 

measurable, 60 y 64 y 107 
measure ，112 
metric, 173 
metric linear ，79 
normal, 78 
normed linear, 79 
probability, 91, 151, 152
product, 61 y 62 

product measurable, 61 、 62 y 137 
product measure, 136 
product probability, 91 
range, 62、 105 
sample probability, 168 
separated, 68 
of sets, 55 
topological ，66 
Sphere, 73 

Spitzer ， 369, 393, 394, 404, 410, 
412 

basic identity, 396 
Spitzer (Cont.)
basic theorem, 401 
Pollaczec — identity, 393 y 400 

Stability 

almost sure, 244 
almost sure convergence and, 
124 

almost sure criterion ，264 
and attraction criterion, 364 
in quadratic mean ， 122, 124 
in probability, 244 y 246 
normal—*theorem, 151 
theorem, 53 

weak — including infinity, 350 
weak — uniform at infinity, 353 
Stable 

characteristic function, 338 
law, 338 y 363 
process, 350 

process, continuous, 372 
rightcontinuous extension 
lemma，371 
State(s) 

absorbing，316 

closed class of, 36 

equivalent, 36 

ever return, 36

indecomposable class of ，36 

instantaneous, 316 

nonrecurrent, 31 y 380 

noreturn, 31 

null, 32 

period of ，33 

positive, 32 

possible ，370 

recurrent, 31 

return, 31 

sets, uniformly continuous, 314 
space, 165, 289 
steady, 314 
transient, 380 
Stationarity, 301 
and law-derivatives, 225 
inequalities, 84 


Stationarity {Cont.) 
integral, 83 

integral — theorem ， 83, 85 
lemma, 84 
second order, 148 
theorem, 87 

theorems ， Markov ， 302, 303 
theorem，random functions，170 
Stationary 

chain ， 如 ， 33 
covariance，148 
decomposability criterion, 226 
Markov process ， 302, 307 
Markov time, 307 
process, 225, 303 
random function, 169, 303 
sequence，83 
transition operator, 302 
transition probability, 29 、 301 
Steinhaus, 409 、 411 、 384 
Banach - ， 102, 332 
Kaczmarz and —， 384 
Step functions ， sample—theorem, 
329 

Stochastic 

independence, 6 y 11 
process, 164 
variable, 174 
Stone, 131 
Stopping, 208 
Strisower, 387 
Strong 

continuity lemma, 336 
closure, 107 

convergence, 106 
convergence in G ， 329, 333 
differentiation lemma, 337 
differentiation operator，337 

generator, 337 
infinitesimal operator, 337 

Kolmogorov equation, 309 
normality, 128 
resolvent, 340 
Strong laws of large numbers, 66,
233 

Borel, 18,19, 26, 244, 77 
Kolmogorov, 241 
Strong Markov 

equivalence theorem, 309 
property, 304, 308, 357 
property theorem, 357 
Strongly 
compact, 107 
continuous, 332 
convergent semi-groups, 343 
differentiable, 332 
integrable, 332 
normal, 128 
quasi-compact, 110 
Structure theorem, 310 
Submartingale, see 

(sub)Martingale(s) 
Supermartingale, 195 
Subspace 
closed linear, 126 
invariant Banach, 336 
linear, 79 
proper, 98 
topological, 66 

Sufficient σ-field, 11
minimal, 13 

Sums of sets, 4^ 51 
Superior limit, 58 
Supremum ， 56、 103 
Sure 

almost, 151 
event, 14^ 151 
Symmetrization, 257 
inequalities, 259 
inequalities, weak, 257 

Tail 

equivalence, 245 
event, 241 
function, 241 
of a sequence, 241 
σ-field, 241


Tanaka, 388 
Tchebichev ，409 
inequality, 11,60 
theorem, 287 

Three-parts decomposition 
theorem, 216 

Three-series criterion, 249 
Tight (ness), 194 
criterion (in C)，267 
lemma, 194 

and relative compactness ，195 
theorem, 194 
Time(s) 

(B n -, 376 
189 

elementary properties of ， 
190 

Brownian, 254 
degenerate, 376, 189 
elementary, 190, 308 
first exit, 255
(first) hitting, 256, 396
jump, 369 
Markov, 307 
Markov, stationary, 307 
martingale, 200 
maxima, and positive sums 
identities, 391 

maximum, and value identity ， 
393 

random, 376 、 188, 305 
random — identities, 390 
random—translations, 376 
stopping, 208 

JCr-y 306 
{Xty 191 

Time-continuous transition prob¬ 
abilities, 310 
Toeplitz lemma, 250 
Topological 
space, 66 
subspace, 66 
Topology 
metric, 73 
reduced, 66 
Total(ly)
bounded set, 75 
probability rule, 24 
variation, 87 

Trajectories, 166 
Transformation(s) 

proper subspace of, 108 
proper value of, 108 
quasi-strongly compact, 107 ， 
110 

quasi-weakly compact, 107 ， 

110 

Transition law derivatives, 299 
existence criterion, 299 
theorem, 300 
Transition operators, 295 
stationary, 302 

Transition probability(ties), 29 y 
31, 32, 90, 295 
differentiation theorem, 317 
differentiation theorem, 
extended, 319, 320 
invariant, theorem, 91 
singular, lemma, 93 
singular vanishing, theorem, 94 
stationary, 302 
time-continuous, 310
Translate(s), 79, 301 
Translation(s),J7d, 71,96 
extension of, 97 
invariance under, 79 
null-preserving, 101, 106 
random times, 376 
Trial(s) 

deterministic, 5 
identical, 5, 6 
independent, 5, 6 
random, 6 
repeated, 5, 6 
Triangle property, 73 
Triangular 

characteristic function, 386 
probability density, 386 


Truncation 

inequalities for characteristic 
functions, 209 
of random variables, 244 
Tucker, 410 
Tukey, 410 
Tulcea, 138、 408 
Two-series criterion, 251 
Type(s), 215 
convergence of，216 
degenerate, 215、 282 
normal, 282 
Poisson, 282 

Ueno, 384 
Ugakawa, 102 
Unconditional 
centering, 215 
convergence, 215 
Unicity 

lemma, 伞 -weak, 341 
theorem, $-weak, 342 
Uniform 

asymptotic negligibility, 302 y 
314 

convergence, 114 
convergence theorem, 204 
ergodic theorem, 113 
Uniform continuity, 77 

and differentiation lemma, 339 
condition, 317 
lemma, 180 
state sets, 314 
Union, 4 y 56 
Upper 
class, 272 
variation, 87 
Urysohn, 78 
Uspensky, 407 

Value(s), possible, 370 
theorem, 371 

Variable, random, 6 y 9 y 17 、 152 
Variance(s), J2 y 244
case of bounded, 302 
case of bounded — limit theorem, 
305 

Variation 
lower, 87 

of truncated moments, 359 

regular, 354 

slow, 354 

total, 87 

upper, 87 

Vector, random, 152、 155 
Ville, 54, 384 

Wald’s relation(s), 377 y 397 y 2S7 

Wax, 236, 386 

Weak 

closure, 107 
compactness, 110 
compactness theorem, 181 


Weak {Cont.) 
convergence, 180, 106 
equivalence, 39 
equivalence criterion, 40 
stability, 350, 353 
symmetrization inequalities, 257
Weierstrass theorem, 51 
Wendel, 405, 412 
Wiener, 212, 237, 387 
Wold, 384 

Yosida, 106, 288, 344 
Yushkevitch, 289, 310, 357, 388 ， 
389 

Zero-one 

criterion, Borel, 24 
law, Hewitt-Savage, 374 
law, Kolmogorov, 241 
Zygmund, 408 、 237 
CONDITIONING




Proof. The asserted equality is true if $g = I_{A'}$ is the indicator of a set
$A' \in \mathcal{A}'$ for, setting $A = Y^{-1}(A')$, we have $g(Y) = I_A$ and, hence,

$$\int_{B'} I_{A'}\,dP'_Y = P'_Y(A'B') = P(AB) = \int_B I_A\,dP.$$

Being true for indicators, the equality is true for simple functions $g$
and, by the monotone convergence theorem, for nonnegative measurable
$g$. The assertion follows upon decomposing a measurable $g$ into its
positive and negative parts.

We are now in a position to prove the above-stated property. 

A. The c.exp. of $X$ given $Y$ is a function of the function $Y$.

Proof. If $\varphi$ is the indefinite integral of $X$ and $\varphi'$ on $\mathcal{A}'$ is defined by

$$\varphi'(B') = \varphi(B), \quad B = Y^{-1}(B'),$$

then $\varphi'$ is $\sigma$-additive and $P'_Y$-continuous, the extended Radon-Nikodym
theorem applies to $\varphi'$ and $P'_Y$ and defines a measurable function $g$ on
$(\Omega', \mathcal{A}')$ by

$$\int_{B'} g\,dP'_Y = \varphi'(B').$$

Since

$$\varphi'(B') = \int_B X\,dP = \int_B (E^Y X)\,dP,$$

it follows, upon applying a, that

$$\int_B g(Y)\,dP = \int_{B'} g\,dP'_Y = \int_B (E^Y X)\,dP,$$

so that the indefinite integrals of the $\mathcal{B}_Y$-measurable functions $g(Y)$ and
$E^Y X$ are the same, and the assertion is proved.
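On a finite pr. space the function $g$ of the proof can be written down explicitly. The following sketch is an illustration under assumptions that are not in the text: a fair die as sample space, $X(\omega) = \omega$, $Y(\omega) = \omega \bmod 2$, and hypothetical function names. It computes $g(y) = E(X \mid Y = y)$ and lets one compare $\int_B g(Y)\,dP$ with $\int_B X\,dP$ for $B = Y^{-1}(B')$.

```python
from fractions import Fraction


def conditional_expectation(prob, X, Y):
    """Return g with g[y] = E(X | Y = y) on a finite pr. space.

    prob, X and Y are dicts keyed by the sample points; values y whose
    inverse image has pr. zero are left out.
    """
    mass, integral = {}, {}
    for w, p in prob.items():
        y = Y[w]
        mass[y] = mass.get(y, 0) + p
        integral[y] = integral.get(y, 0) + p * X[w]
    return {y: integral[y] / mass[y] for y in mass if mass[y] != 0}


def integral_over_preimage(prob, f, Y, B_prime):
    """Integral of f with respect to P over B = Y^{-1}(B')."""
    return sum(p * f[w] for w, p in prob.items() if Y[w] in B_prime)


# Illustrative data: a fair die, X the face value, Y its parity.
omega = range(1, 7)
prob = {w: Fraction(1, 6) for w in omega}
X = {w: Fraction(w) for w in omega}
Y = {w: w % 2 for w in omega}

g = conditional_expectation(prob, X, Y)
g_of_Y = {w: g[Y[w]] for w in omega}  # the function g(Y) on the original space

if __name__ == "__main__":
    print(g)
    for B_prime in (set(), {0}, {1}, {0, 1}):
        print(B_prime,
              integral_over_preimage(prob, g_of_Y, Y, B_prime),
              integral_over_preimage(prob, X, Y, B_prime))
```

Exact rational arithmetic is used so that the two indefinite integrals can be compared for equality rather than up to rounding.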

As defined, $E^Y X$ is a $\mathcal{B}_Y$-measurable function on the original space
$(\Omega, \mathcal{A}, P)$. However, the usual interpretation of the c.exp. of $X$ given
$Y$ is that it is the function $g$ on $(\Omega', \mathcal{A}')$ defined by

$$\int_{B'} g\,dP'_Y = \int_B X\,dP, \quad B' \in \mathcal{A}', \quad B = Y^{-1}(B').$$

We prefer to consider c.exp.'s as functions on the original pr. space.
Yet, on account of the foregoing theorem, both interpretations are
possible: either $E^Y X$ is considered as a function of the function $Y$ with values
orthonormalized proper functions of the covariance $\Gamma(t, t')$ of $X(t)$,
and form the integrals

$$\lambda_n \xi_n = \int_I X(t)\,\bar\psi_n(t)\,dt.$$

These integrals exist, since $X(t)$ (in q.m.) and $\psi_n(t)$ are continuous on
the closed interval $I$, and

$$E\xi_m \bar\xi_n = \delta_{mn}, \quad EX(t)\bar\xi_n = \lambda_n \psi_n(t).$$

It follows from Mercer's theorem that $E|X(t) - X_n(t)|^2 \to 0$
uniformly on $I$, and the “if” assertion is proved.
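The uniform convergence supplied by Mercer's theorem can be seen on a standard example. The sketch below is an illustration only, resting on assumptions not stated in this passage: it takes the covariance $\min(s, t)$ on $I = [0, 1]$, whose orthonormalized proper functions are $\sqrt{2}\sin((n - \tfrac12)\pi t)$ with proper values $1/((n - \tfrac12)\pi)^2$, reads $X_n(t)$ as the partial sum $\sum_{k \le n} \lambda_k \xi_k \psi_k(t)$, and uses $E\xi_m\bar\xi_n = \delta_{mn}$, $EX(t)\bar\xi_n = \lambda_n\psi_n(t)$, so that $E|X(t) - X_n(t)|^2 = \Gamma(t, t) - \sum_{k \le n} |\lambda_k|^2 |\psi_k(t)|^2$. The function names are hypothetical.

```python
import math


def eigenpair(n):
    """n-th proper value |lambda_n|^2 and proper function psi_n of the
    covariance min(s, t) on [0, 1] (n = 1, 2, ...)."""
    freq = (n - 0.5) * math.pi
    return 1.0 / freq ** 2, (lambda t, freq=freq: math.sqrt(2.0) * math.sin(freq * t))


def truncation_error(t, n_terms):
    """E|X(t) - X_n(t)|^2 = Gamma(t, t) - sum_{k <= n} |lambda_k|^2 psi_k(t)^2,
    with Gamma(t, t) = t for the covariance min(s, t)."""
    total = 0.0
    for k in range(1, n_terms + 1):
        value, psi = eigenpair(k)
        total += value * psi(t) ** 2
    return t - total


def sup_error(n_terms, grid_size=101):
    """Largest truncation error over an evenly spaced grid of [0, 1]."""
    return max(truncation_error(i / (grid_size - 1), n_terms)
               for i in range(grid_size))


if __name__ == "__main__":
    for n_terms in (1, 10, 100):
        print(n_terms, sup_error(n_terms))
```

The printed suprema decrease with the number of terms, which is the uniform convergence on $I$ used in the proof, checked here only on a grid.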

In physics, the most important orthogonal decomposition is the
harmonic one, for, loosely speaking, it yields “amplitudes,” and hence
“energies,” corresponding to the various parts of the “spectrum” of
the random function, and we seek it now. But, first, we have to
introduce random functions which correspond to sums of orthogonal r.v.'s.
It will be convenient to denote the increment of a function, say $\xi(t)$,
on an interval $[a, b)$ by $\xi[a, b) = \xi(b) - \xi(a)$. Increment functions are
characterized by their additivity:

$$\xi[a, b) + \xi[b, c) = \xi[a, c)$$

and determine the point functions $\xi(t)$ up to additive quantities. While
what follows is valid for more general ordered sets $T$, we shall assume,
to simplify the language, that $T \subset R$.

A second order random function $\xi(t)$ has orthogonal increments if,
for disjoint intervals $[a, b)$, $[a', b')$,

$$E\xi[a, b)\,\bar\xi[a', b') = 0.$$

Then

$$E|\xi[a, b) \pm \xi[a', b')|^2 = E|\xi[a, b)|^2 + E|\xi[a', b')|^2,$$

and it follows, by setting, for some fixed $a$,

$$E|\xi[a, t)|^2 = F(t), \quad E|\xi[t, a)|^2 = -F(t),$$

that

$$E|\xi[t, t')|^2 = F[t, t');$$

for short,

$$E|d\xi(t)|^2 = dF(t).$$
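A toy model makes these relations concrete. In the sketch below, which is an assumption-laden illustration and not the general case, $\xi(t)$ is a pure jump function $\sum_{t_k < t} c_k Z_k$ with orthonormal r.v.'s $Z_k$, so that an increment is determined by its coefficients and second moments reduce to finite sums. The jump times, coefficients, and function names are hypothetical.

```python
def increment(jumps, a, b):
    """Coefficients of xi[a, b) in the orthonormal family (Z_k):
    the jumps (t_k, c_k) with a <= t_k < b, keyed by their index k."""
    return {k: c for k, (t_k, c) in enumerate(jumps) if a <= t_k < b}


def inner(u, v):
    """E u conj(v) for u = sum u_k Z_k and v = sum v_k Z_k, (Z_k) orthonormal."""
    return sum(c * v[k].conjugate() for k, c in u.items() if k in v)


def F(jumps, t, origin=0.0):
    """F(t) = E|xi[origin, t)|^2 for t >= origin and -E|xi[t, origin)|^2 otherwise."""
    if t >= origin:
        u = increment(jumps, origin, t)
        return inner(u, u).real
    u = increment(jumps, t, origin)
    return -inner(u, u).real


# Hypothetical jump times t_k and complex coefficients c_k.
JUMPS = [(-1.0, 1 + 1j), (0.5, 2.0 + 0j), (1.5, -1j), (2.5, 0.5 + 0.5j)]

if __name__ == "__main__":
    u, v = increment(JUMPS, 0.0, 1.0), increment(JUMPS, 1.0, 3.0)
    print(inner(u, v))                      # disjoint intervals
    w = increment(JUMPS, -2.0, 2.0)
    print(inner(w, w).real, F(JUMPS, 2.0) - F(JUMPS, -2.0))
```

In this model $F$ is a nondecreasing step function, and the second printed line compares $E|\xi[t, t')|^2$ with $F(t') - F(t)$ for $t = -2$, $t' = 2$.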


